Abstract —In many scenarios, networks emerge endogenously
as cognitive agents establish links in order to exchange information.
Network formation has been widely studied in economics,
but only on the basis of simplistic models that assume that the 
value of each additional piece of information is constant. In this 
paper we present a first model and associated analysis for network
formation under the much more realistic assumption that
the value of each additional piece of information depends on the
type of that piece of information and on the information already
possessed: information may be complementary or redundant. We 
model the formation of a network as a non-cooperative game in 
which the actions are the formation of links and the benefit of 
forming a link is the value of the information exchanged minus 
the cost of forming the link. We characterize the topologies 
of the networks emerging at a Nash equilibrium (NE) of this 
game and compare the efficiency of equilibrium networks with 
the efficiency of centrally designed networks. To quantify the 
impact of information redundancy and linking cost on social 
information loss, we provide estimates for the Price of Anarchy 
(PoA); to quantify the impact on individual information loss we
introduce and provide estimates for a measure we call Maximum
Information Loss (MIL). Finally, we consider the setting in which
agents are not endowed with information, but must produce it.
We show that the validity of the well-known “law of the few”
depends on how information aggregates; in particular, the “law
of the few” fails when information displays complementarities.

Index Terms —Cognitive networking, cognitive agents, information
networks, network formation, self-organizing networks.


I. Introduction 

The widespread usage of mobile devices, together with
the emergence of social-based services and applications,
has inspired novel and self-organized networking paradigms
that capitalize on the ability of mobile devices to connect 
and share information in an ad-hoc fashion. Contemporary 
networks, where users produce and exchange information, 
are “socio-technological” in nature; users do not necessarily 
exploit an exogenously designed network infrastructure, but 
rather form an endogenous network driven by the individual 
users’ quest for information. In this paper, we present a 
novel network formation model for information exchange 
over endogenously formed networks. Although abstract,
our model provides insights into understanding and designing 
many emerging and envisioned classes of applications. 

The authors are with the Department of Electrical Engineering, University 
of California Los Angeles (UCLA), Los Angeles, CA, 90095, USA (e-mail: 
ahmedmalaa@ucla.edu, ahujak@ucla.edu, mihaela@ee.ucla.edu). This work 
was funded by the Office of Naval Research (ONR). 


A. Motivation 

Many emerging networks are formed endogenously by self- 
interested agents, who take information sharing and production 
actions. Examples of such networks are: dynamic spectrum 
management by wireless users [1], social networks overlaid 
on technological networks [2] [3], device-to-device (D2D) 
communications, vehicular networks [5], Internet-of-Things 
(IoT) [6], and smart sensor networks [7]. In many of these
networks, users connect to each other in order to exchange
and gather information. For instance, secondary users exchange
information about spectrum occupancy in cognitive
radio networks [8] [9], autonomous rescue robots exchange 
environmental sensory information [6] [10], D2D users engage 
in short range communications in order to exchange data 
content of Social Networks Services (SNSs) [11], and self- 
interested users take capacity allocation decisions for multicast 
streaming over networks [12]. Users in such networks possess 
two key features: they are opportunistic, in the sense that 
they exploit their opportunistic encounter with other mobile 
users to establish short-range communication links with them, 
and they are cognitive, in the sense that they need to reason 
about establishing costly communication links with others 
given the value of information they can get via these links. 
Information in this context is an abstraction for any class 
of data that users gather and process, such as multi-modal 
content, geographical information, event-related information, 
cached content, behavioral data, and personal sensory information
[13]-[15]. For instance, mobile users who coexist in
close proximity can share information about traffic congestion
and road accidents which helps them update their routes via
applications such as Waze and Google Maps, and D2D users
can gather offloaded traffic of context-aware applications from 
other users by forming short-range communications links [11]. 
Moreover, information can also be produced by the agents 
themselves in the form of user-generated content, such as the 
upload and creation of blogs, videos and photos on online 
social networks (OSN), the purchase of content from service 
providers in peer-to-peer networks, updating traffic information
via an application such as Waze, etc. Thus, users in such
networks can jointly decide how much information they should
produce, and how much information they should opportunistically
acquire from other users. Because users are self-interested,
a game-theoretic framework is naturally deployed
to study which networks will emerge at equilibrium and what
their characteristics are. Network formation has been studied
in the economics, electrical engineering and computer science 
literature. In the following subsection, we briefly review these 
related works on endogenous network formation.

B. Related Works 

Strategic network formation was first studied in the economics
literature. Some of this literature [16]-[21] asks which
networks are stable (according to some criteria) and hence 
more likely to persist and be observed. A (smaller) literature 
asks which networks emerge as the result of some specific 
dynamic process [22] [23]. In all these works, simplistic 
benefit functions are used: the value of each additional “good” 
exchanged is constant [16]-[19]. However, in realistic settings, 
information possessed by different agents can be redundant 
or complementary. For instance, secondary users in a multi-band
cognitive radio system may be interested in gathering
information about spectrum occupancy for bands that they do 
not sense by communicating with other users who do sense 
these bands [8]; sensors deployed over a correlated random 
field [24]-[28] may be interested in gathering complementary 
measurements about some set of physical processes of interest; 
and mobile users who exchange offloaded traffic of SNSs 
and context-aware applications are only interested in gathering 
non-redundant traffic and data updates. 

C. Summary of contributions 

This paper introduces a new model for strategic network 
formation where autonomous cognitive agents exchange valuable
information. We refer to such networks as cognitive
information networks (CIN); networks in which agents self- 
organize to gather/exchange and produce information about a 
state of the world. This state of the world can be spectrum 
occupancy information and primary user activity in a multi-band
cognitive radio system, location information provided by
anchors of wireless networks, a set of messages sent by information
sources in a multicast network, or blogs, videos, and
data exchanged by users of social-physical networks. Agents 
are cognitive since they perceive information possessed by 
other agents, reason about which links to establish, how much 
information to produce, and then take information production 
and link formation decisions which result in an endogenously- 
formed network topology. We assume that agents in a CIN 
possess different amounts of information, benefit only from 
gathering non-redundant information, and they form links with 
each other in order to gather information and maximize their 
knowledge of the state of the world. 

Since the information possessed by different agents may 
be correlated (redundant), and link formation is costly, agents 
should cognitively select which agents to link with. We formulate
this problem as a non-cooperative network formation
game. Using information-theoretic measures for the value of
the information possessed by each agent, we aim at characterizing
the emerging stable network topologies at Nash
Equilibrium (NE). Throughout our analysis, we focus on two 
classes of linking cost scenarios: homogeneous link formation 
cost and heterogeneous link formation costs. In the former, 
connecting to any agent entails the same cost, while in the 
latter, the link cost is recipient-dependent. The link cost can
correspond to tokens [29] [30], or an abstraction for any 


monetary, energy, or delay costs incurred by the agent forming 
the link. An agent in the network is an abstraction for a mobile 
user, a mobile device, or a transmitter/receiver that is rational 
and self-interested. 

We show that the networks that emerge at equilibrium are 
minimally connected; thus, agents tend to minimize the overall 
cost of constructing the network. With homogeneous link 
costs, equilibrium leads to a network in which each component 
is a star. Moreover, we show how information redundancy 
affects the link cost ranges at which the network becomes 
connected or disconnected, in addition to its impact on the 
network efficiency by quantifying the Price-of-Anarchy (PoA). 
For instance, we show that for networks with low link costs,
when the link costs are homogeneous, all emerging networks 
are efficient; in contrast, information redundancy can induce 
costly anarchy in networks with heterogeneous link costs. 

Finally, we consider a setting in which each agent will
not only decide which links to form, but also the amount 
of information to produce and we provide a characterization 
for the emerging NE. We show that when the number of 
agents is large, the fraction of agents producing information 
at equilibrium depends on the amount of redundancy in 
the agents’ information. When the agents produce strongly 
correlated information, the fraction of information producers 
is small and tends to zero as the number of agents tends to 
infinity: most agents get the information they need from a 
small set of agents. On the other hand, when agents have 
uncorrelated information, the number of information producers 
can grow at the same rate as the total number of agents. Thus,
such networks violate what Galeotti and Goyal [31] call the 
“law of the few”. In addition, we quantify the total amount 
of information produced in an asymptotically large network 
and identify scenarios in which the amount of information 
produced at equilibrium grows with the number of agents. 

This paper introduces a new model for cognitive agents exchanging
information/knowledge and studies what networks
emerge endogenously as a result of self-organizing cognitive
agents. Since many applications can use the presented model,
we do not delve into the idiosyncratic details of specific
applications. The rest of the paper is organized as follows. In
Section II, we formalize the network formation game among 
agents in a CIN. Section III characterizes the emerging stable 
networks when the link formation costs are homogeneous, and
investigates the efficiency of such networks. Section IV
analyzes the network topology and equilibrium efficiency for 
the case of heterogeneous link costs. The joint information 
production and link formation game is studied in Section V. 
Suggested future extensions for our model are provided in 
Section VI. Finally, conclusions are drawn in Section VII.

II. Basic Model 

In this section, we discuss the problem setting and propose 
a basic model to formulate the endogenous network formation 
game emerging among cognitive agents. 

A. Information model 

Let $\mathcal{N} = \{1, 2, 3, \ldots, N\}$ be the set of agents in the CIN.
Each agent $i$ possesses exogenous information in the form of a
discrete random variable $X_i$ and aims to form links with other
agents to maximize its utility, which is defined as the benefit
from the total information it possesses minus the linking cost.
The formation of links is costly; thus, an agent has to trade
off the benefits of the information it obtains from another
agent versus the cost it needs to pay for connecting with that
agent. The amount of information in $X_i$ is quantified by the
entropy function $H(X_i)$. In addition, the random variables of
all agents may be correlated, which indicates that some agents
may possess similar information that is redundant to that of
the other agents. The common information between agents $i$
and $j$ is captured by the mutual information $I(X_i; X_j)$.

The information possessed by the set of agents $\mathcal{N}$ is
captured by an entropic vector that we define as follows.

Definition 1: Entropic vector- a vector $\mathbf{H}$ is said to be an
entropic vector of order $N$ if there exists a random variable
tuple $(X_1, X_2, \ldots, X_N)$, where associated with any subset $\mathcal{V}$
of $\mathcal{N}$, there is a joint entropy $H(X_{\mathcal{V}})$ that is an element of
$\mathbf{H}$, where $X_{\mathcal{V}} = \{X_i \mid i \in \mathcal{V}\}$ [35]. ■

The elements of $\mathbf{H}$ represent the joint entropies of all
possible subsets of random variables possessed by agents in
$\mathcal{N}$. The set of all entropic vectors constitutes the entropic region
which we define as follows.

Definition 2: Entropic region- the entropic region $\Gamma_N^* \subset \mathbb{R}^{2^N - 1}$
is the set of all entropic vectors of order $N$, i.e. the
set of all possible entropic vectors that can correspond to the
information possessed by $N$ agents. Thus, if a vector $\mathbf{H}$ is
entropic, then $\mathbf{H} \in \Gamma_N^*$ [35]. ■

We denote by $\mathcal{H}$ the set of entropic vectors having
$H(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i)$, where $\mathcal{H} \subset \Gamma_N^*$. The
set of entropic vectors in $\mathcal{H}$ is simply a hyperplane in $\Gamma_N^*$
that corresponds to all entropic vectors with no information
redundancies, which captures the aggregation models in [16]
[17] [21].

The entropic vector can be constructed as follows. Given
the set of agents $\mathcal{N}$ and a corresponding set of random
variables $\mathcal{X} = \{X_1, X_2, \ldots, X_N\}$, we construct the set
$\mathcal{V} = \mathcal{P}(\mathcal{X}) \setminus \{\emptyset\}$, where $\mathcal{P}(\mathcal{X})$ is the power set of $\mathcal{X}$. If
$\mathcal{V} = \{v_1, v_2, \ldots, v_{|\mathcal{V}|}\}$, then the entropic vector is given by
$\mathbf{H} = \left(H(X_{v_1}), H(X_{v_2}), \ldots, H(X_{v_{|\mathcal{V}|}})\right)^T$, where $|\mathcal{V}| = 2^N - 1$, and $H(X_{v_i})$
is the joint entropy of all random variables in the
set $v_i$. For instance, if we have 3 agents in the network,
then $\mathcal{V} = \{\{1\}, \{2\}, \{3\}, \{1,2\}, \{2,3\}, \{1,3\}, \{1,2,3\}\}$,
and the entropic vector $\mathbf{H}$ is given by
$\left(H(X_1), H(X_2), H(X_3), H(X_{1,2}), H(X_{1,3}), H(X_{2,3}), H(X_{1,2,3})\right)^T$,
where $H(X_{1,2}) = H(X_1, X_2)$. We denote a
single element in the entropic vector as $H(v) = H(X_v)$. The
mutual information between the random variables possessed
by any two subsets $\mathcal{W}$ and $\mathcal{U}$ of agents is given by [36]

$$I(X_{\mathcal{W}}; X_{\mathcal{U}}) = H(X_{\mathcal{W}}) + H(X_{\mathcal{U}}) - H(X_{\mathcal{W}}, X_{\mathcal{U}}).$$


The total amount of information in the network is given by
the joint entropy of the random variables of individual agents
$H(\mathcal{X}) = H(X_1, X_2, X_3, \ldots, X_N)$, where $H(\mathcal{X}) \in \mathbf{H}$.
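To make the construction concrete, the following Python sketch computes the entropic vector of a small joint pmf and evaluates the mutual information identity above. The toy distribution (agent 1 holds a copy of agent 0's fair bit, agent 2 holds an independent fair bit), the 0-based agent indices, and all function names are illustrative assumptions rather than part of the model.

```python
from itertools import combinations
from math import log2

def joint_entropy(pmf, subset):
    """Entropy (in bits) of the marginal of `pmf` on the coordinates in `subset`."""
    marginal = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in subset)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def entropic_vector(pmf, n):
    """Map every non-empty subset of agents {0, ..., n-1} to its joint entropy."""
    subsets = [s for k in range(1, n + 1) for s in combinations(range(n), k)]
    return {s: joint_entropy(pmf, s) for s in subsets}

def mutual_information(H, w, u):
    """I(X_W; X_U) = H(X_W) + H(X_U) - H(X_W, X_U), read off the entropic vector H."""
    union = tuple(sorted(set(w) | set(u)))
    return H[tuple(sorted(w))] + H[tuple(sorted(u))] - H[union]

# Assumed toy pmf over (X0, X1, X2): X1 = X0, and X2 is independent of both.
pmf = {(a, a, b): 0.25 for a in (0, 1) for b in (0, 1)}
H = entropic_vector(pmf, 3)

for subset, value in H.items():
    print(subset, value)
print("I(X0; X1) =", mutual_information(H, (0,), (1,)))
print("I(X0; X2) =", mutual_information(H, (0,), (2,)))
```

The dictionary has $2^3 - 1 = 7$ entries, one per non-empty subset, in the same order as the 3-agent example above.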

The mutual information between any two agents $i$ and $j$ is
given by $I(X_i; X_j) = H(X_i) - H(X_i|X_j)$, where $H(X_i|X_j)$
is the conditional entropy which represents the additional
information attained by agent $j$ from connecting to $i$, i.e.
the amount of extra information that $j$ gets when getting
the information of $i$. If this benefit is low, it means that
$I(X_i; X_j)$ is high, i.e. $X_i$ and $X_j$ are highly correlated, and
vice versa. Note that mutual information is symmetric, i.e.
$I(X_i; X_j) = H(X_i) - H(X_i|X_j) = H(X_j) - H(X_j|X_i)$.
Finally, we quantify the total amount of redundant information
in the network. Let $p(\mathcal{X}) = p(X_1, X_2, \ldots, X_N)$ and
$q(\mathcal{X}) = \prod_{i=1}^{N} p(X_i)$, where $p(X_i)$ is the pmf of $X_i$. The
Kullback-Leibler (KL) divergence for these distributions can
be computed as follows [36]

$$D(p\|q) = \sum p(\mathcal{X}) \log \frac{p(\mathcal{X})}{q(\mathcal{X})} = \sum_{i=1}^{N} H(X_i) - H(X_1, X_2, \ldots, X_N). \quad (1)$$

The KL divergence is a natural metric for quantifying the 
distance between probability measures, and it can be obtained 
in terms of the entropy as shown in (1). In particular, the 
KL divergence of $q(\mathcal{X})$ from $p(\mathcal{X})$ is equal to the difference
between the amount of information possessed jointly
by the agents, and the corresponding amount of information
possessed by the same agents if such information has no
redundancies. Throughout the paper, we use $H(X_{-i})$ to denote
$H(\mathcal{X} \setminus \{X_i\})$, and $KL(\mathcal{X}) = D(p\|q)$ to denote the KL
divergence.
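Identity (1) can be checked numerically. The sketch below, written under the same assumed toy pmf as a stand-in for the agents' information (the pmf and the helper names are hypothetical), evaluates the KL divergence from its definition and compares it with the sum of the individual entropies minus the joint entropy.

```python
from math import log2

def marginal(pmf, i):
    """Marginal pmf of coordinate i."""
    m = {}
    for outcome, p in pmf.items():
        m[outcome[i]] = m.get(outcome[i], 0.0) + p
    return m

def entropy(dist):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def kl_to_product(pmf, n):
    """D(p || q), where q is the product of the single-agent marginals of p."""
    margs = [marginal(pmf, i) for i in range(n)]
    total = 0.0
    for outcome, p in pmf.items():
        if p > 0:
            q = 1.0
            for i in range(n):
                q *= margs[i][outcome[i]]
            total += p * log2(p / q)
    return total

# Assumed toy pmf: X1 is a copy of X0, X2 is an independent fair bit.
pmf = {(a, a, b): 0.25 for a in (0, 1) for b in (0, 1)}
lhs = kl_to_product(pmf, 3)
rhs = sum(entropy(marginal(pmf, i)) for i in range(3)) - entropy(pmf)
print(lhs, rhs)  # both sides equal 1 bit: the single bit shared by X0 and X1
```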


B. Network formation game 

Agents benefit from gathering information by linking to
other agents. The link formation strategy adopted by agent
$i$ is denoted by a tuple $g_i = (g_{ij})_{j \in \{1,\ldots,N\} \setminus \{i\}} \in \{0,1\}^{N-1}$;
$g_{ij} = 1$ if agent $i$ forms a link with agent $j$ and $g_{ij} = 0$
otherwise. We assume unilateral link formation where an
agent decides to form a link and solely bears the cost of link
formation.¹ A strategy profile $\mathbf{g}$ is defined as the collection
of strategies of all agents, i.e. $\mathbf{g} = (g_i)_{i=1}^{N} \in G$, where
$G$ is a finite space. When agent $i$ forms a link with agent
$j$, it incurs a cost of $c_{ij}$. We define the topology of the
network as $\mathcal{T} = \{(i,j) \in \mathcal{N} \times \mathcal{N} \mid \max\{g_{ij}, g_{ji}\} = 1\}$.
All connected agents exchange information bilaterally; thus $\mathcal{T}$
is an undirected graph. Information is shared between agents
that are indirectly connected and agents do not benefit from
receiving multiple versions of the same information from the
same agent. Such a model is suitable for networks with multi-hop
relaying where information is forwarded from one node
to another [37]. We write $i \to j$ to indicate that agent $j$
is reachable by agent $i$ either directly or indirectly. Define
the set of agents that $i$ forms links with (set of neighbors) as
$\mathcal{N}_i(\mathbf{g}) = \{j \mid g_{ij} = 1\}$, and the set of agents reachable by agent
$i$ as $\mathcal{R}_i(\mathbf{g}) = \{j \mid i \to j\}$. Throughout the paper, we adopt the
following definitions.

Definition 3: Network component- a component $C$ is a set
of agents such that $i \to j, \forall i,j \in C$, and $i \not\to j, \forall i \in C$ and
$j \notin C$, i.e. two agents in two different components cannot
share information. ■

¹ Other link formation models, such as link formation with bilateral consent,
can be used with an appropriate solution concept such as pairwise stability
as we discuss in Section VII.

Definition 4: Minimally connected component- a component
$C$ is minimally connected if each agent $i \in C$ is connected
to each agent $j \in C$ via a unique path. ■
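The definitions above translate directly into a few graph routines. The following Python sketch is a minimal illustration, assuming a strategy profile is stored as a 0/1 matrix `g` with 0-based agent indices (a representation chosen here for convenience). It builds the undirected topology, extracts the components, and tests minimal connectivity by checking that each component is a tree, i.e. that it has exactly one fewer edge than it has agents.

```python
def topology(g):
    """Undirected edge set {(i, j) : max(g_ij, g_ji) = 1}, stored with i < j."""
    n = len(g)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if g[i][j] or g[j][i]}

def components(g):
    """Partition of the agents into components (sets of mutually reachable agents)."""
    n = len(g)
    adjacency = {i: set() for i in range(n)}
    for i, j in topology(g):
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(adjacency[v] - component)
        seen |= component
        result.append(component)
    return result

def is_minimally_connected(g):
    """True if every component is a tree, so any two of its agents share a unique path."""
    edges = topology(g)
    return all(sum(1 for i, _ in edges if i in c) == len(c) - 1
               for c in components(g))

# Agent 0 sponsors links to agents 1 and 2; agent 3 stays isolated.
g_star = [[0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
# Same profile, but agent 1 additionally links to agent 2, which closes a cycle.
g_cycle = [[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(components(g_star), is_minimally_connected(g_star))
print(components(g_cycle), is_minimally_connected(g_cycle))
```

In this representation the reachable set $\mathcal{R}_i(\mathbf{g})$ is the component containing $i$ with $i$ itself removed.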

Agents in a component share the information they possess and
consequently attain “informational” benefits that are captured
via a utility function. The utility function of agent $i$ is given
by

$$u_i(\mathbf{g}) = f\left(H(X_{i \cup \mathcal{R}_i(\mathbf{g})})\right) - \sum_{j \in \mathcal{N}_i(\mathbf{g})} c_{ij}, \quad (2)$$

where the function $f(\cdot)$ represents the benefit of agent $i$ from
the information it gathers. We assume that an agent's benefit
from acquiring information increases, while the marginal benefit
decreases, with the amount of information
gathered. For example, in a sensor network setting, the benefit
of a sensor node from collecting information saturates if
it is connected to a large number of sensors; thus, $f(\cdot)$ is
assumed to be twice continuously differentiable, increasing,
and concave with $f(0) = 0$. Note that the total information
acquired by $i$ in (2) can be written in terms of the conditional
entropies based on the chain rule as [36]

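$$H(X_{i \cup \mathcal{R}_i(\mathbf{g})}) = H(X_i) + \sum_{k=1}^{|\mathcal{R}_i(\mathbf{g})|} H(X_{j_k} \mid X_i, X_{j_1}, \ldots, X_{j_{k-1}}),$$

where $\mathcal{R}_i(\mathbf{g}) = \{j_1, j_2, \ldots, j_{|\mathcal{R}_i(\mathbf{g})|}\}$, which implies that agents
benefit by acquiring new information conditioned on their own
information and the information they acquire from other connections.
Moreover, the aggregate information can be expressed
in terms of the mutual information as

$$H(X_{i \cup \mathcal{R}_i(\mathbf{g})}) = H(X_i) + H(X_{\mathcal{R}_i(\mathbf{g})}) - I(X_i; X_{\mathcal{R}_i(\mathbf{g})}),$$

where the term $H(X_{\mathcal{R}_i(\mathbf{g})})$ represents the net information that
agent $i$ acquires after connecting to the agents in $\mathcal{N}_i(\mathbf{g})$, and
the term $I(X_i; X_{\mathcal{R}_i(\mathbf{g})})$ captures the redundancy between the
information of agent $i$ and the information it acquires from the
set $\mathcal{R}_i(\mathbf{g})$. Let $\mathbf{u} = (u_1, u_2, \ldots, u_N)$. Throughout the paper, we
denote the network formation game by $\mathcal{G}^N = (\mathcal{N}, G, \mathbf{u}, \mathbf{H})$. We
assume a complete information scenario, where all agents have
knowledge of the entropic vector $\mathbf{H}$, the strategy space $G$ and
the utilities of all agents $\mathbf{u}$.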
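As a minimal sketch of the utility in (2) and of the chain-rule decomposition, the Python code below uses a deliberately simple information model that is an assumption of this example only: each agent holds a set of independent fair bits, so the joint entropy of a group of agents is the number of distinct bits they hold. The benefit function $f(x) = \log(1 + x)$ (natural logarithm), the link cost 0.2, and all names are likewise illustrative choices.

```python
from math import log

def joint_entropy(agents, info):
    """H(X_S) in bits when agent k holds the independent fair bits info[k] (assumed model)."""
    return len(set().union(*(info[k] for k in agents)))

def reachable(g, i):
    """Agent i together with R_i(g): everyone connected to i in the undirected topology."""
    seen, stack = {i}, [i]
    while stack:
        v = stack.pop()
        for w in range(len(g)):
            if w not in seen and (g[v][w] or g[w][v]):
                seen.add(w)
                stack.append(w)
    return seen

def utility(g, i, info, cost, f=lambda x: log(1 + x)):
    """u_i(g) = f(H(X_{i and R_i(g)})) minus the cost of the links that i sponsors."""
    benefit = f(joint_entropy(reachable(g, i), info))
    return benefit - sum(cost[i][j] for j in range(len(g)) if g[i][j])

info = [{0, 1, 2}, {2, 3}, {3}]          # H(X0) = 3, H(X1) = 2, H(X2) = 1 bits
g = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]    # agent 1 links to 0, agent 2 links to 1
cost = [[0.2] * 3 for _ in range(3)]

# Chain rule for agent 0: H(X0), H(X1 | X0), H(X2 | X0, X1).
order = [0, 1, 2]
conditional = [joint_entropy(order[:k + 1], info) - joint_entropy(order[:k], info)
               for k in range(len(order))]
print(conditional, sum(conditional))
print([utility(g, i, info, cost) for i in range(3)])
```

Agent 0 sponsors no link yet gathers all the information through the links paid for by agents 1 and 2, which is the free-riding effect that unilateral link formation allows.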

C. Stability concept and network efficiency 

The link formation game is formulated as a non-cooperative
simultaneous move game and we focus on the Nash Equilibrium
(NE) as the solution concept. The NE is defined as
follows

$$u_i(g_i^*, \mathbf{g}_{-i}^*) \geq u_i(g_i, \mathbf{g}_{-i}^*), \quad \forall g_i \in \{0,1\}^{N-1}, \forall i \in \mathcal{N}, \quad (3)$$

where $g_i^*$ is the NE strategy of agent $i$, and $\mathbf{g}_{-i}^*$ is the
NE strategy profile of all users other than $i$. A strict NE is
obtained by making the inequality in (3) strict. The game
can have multiple NE, which form the set
$G^* = \{\mathbf{g}^* \mid u_i(g_i^*, \mathbf{g}_{-i}^*) \geq u_i(g_i, \mathbf{g}_{-i}^*), \forall g_i \in \{0,1\}^{N-1}, \forall i \in \mathcal{N}\}$.
In the following Theorem, we
show that there exists at least one network satisfying the NE
conditions, i.e. $G^* \neq \emptyset$.


Theorem 1: (The Existence of Nash Equilibrium) A pure
strategy NE always exists for $\mathcal{G}^N = (\mathcal{N}, G, \mathbf{u}, \mathbf{H})$.

Proof See Appendix A. ■ 

The social welfare of the network formation game is defined
as the sum of agents’ individual utilities. For a strategy profile
$\mathbf{g}$, the social welfare is defined as

$$U(\mathbf{g}) = \sum_{i \in \mathcal{N}} u_i(\mathbf{g}). \quad (4)$$

A strategy profile $\hat{\mathbf{g}}$ is called socially optimal if it maximizes
the social welfare (achieves the social optimum $U^*$), i.e.

$$U^* = U(\hat{\mathbf{g}}) \geq U(\mathbf{g}), \quad \forall \mathbf{g} \in G. \quad (5)$$

When there are multiple equilibria, we use two metrics to
assess the equilibrium efficiency. First, we adopt the Price of
Anarchy (PoA) to quantify the impact of the agents’ selfish
behavior on the social welfare. The PoA is defined as the
ratio between the social optimum and the lowest social welfare
achieved at equilibrium, i.e.

$$\mathrm{PoA} = \frac{U^*}{\min_{\mathbf{g}^* \in G^*} U(\mathbf{g}^*)}. \quad (6)$$

In addition, we analyze the impact of the agents’ selfish
behavior on the information gathering process by defining a 
novel metric that we term the Maximum Information Loss 
(MIL). The MIL is defined as the maximum difference be¬ 
tween the amount of information gathered by any agent at two 
different equilibria as shown in (7). Unlike the PoA, the MIL 
quantifies the maximum information loss without considering 
the link cost. In addition, while the PoA considers the welfare 
of all agents, the MIL quantifies the highest information loss 
incurred by an agent in the worst case. 
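For very small networks, both metrics can be evaluated by exhaustive search. The sketch below is an illustration under stated assumptions, not the analysis used in the paper: it assumes the independent-bits information model (joint entropy equals the number of distinct bits held), a homogeneous link cost `c`, and $f(x) = \log(1 + x)$; it enumerates all strategy profiles of a 3-agent game, keeps the pure-strategy NE, and computes the PoA and the MIL from their verbal definitions above.

```python
from itertools import product
from math import log

def f(x):
    return log(1 + x)  # assumed benefit function: increasing, concave, f(0) = 0

def reachable(g, i):
    seen, stack = {i}, [i]
    while stack:
        v = stack.pop()
        for w in range(len(g)):
            if w not in seen and (g[v][w] or g[w][v]):
                seen.add(w)
                stack.append(w)
    return seen

def info_gathered(g, i, info):
    """Bits available to agent i: its own plus those of every reachable agent."""
    return len(set().union(*(info[k] for k in reachable(g, i))))

def utility(g, i, info, c):
    return f(info_gathered(g, i, info)) - c * sum(g[i])

def strategies(n, i):
    """All link vectors of agent i (the diagonal entry g_ii is fixed to 0)."""
    rows = []
    for bits in product((0, 1), repeat=n - 1):
        row = list(bits)
        row.insert(i, 0)
        rows.append(tuple(row))
    return rows

def is_nash(g, info, c):
    n = len(g)
    for i in range(n):
        best = max(utility(g[:i] + (alt,) + g[i + 1:], i, info, c)
                   for alt in strategies(n, i))
        if utility(g, i, info, c) < best - 1e-12:
            return False
    return True

def analyse(info, c):
    n = len(info)
    profiles = list(product(*(strategies(n, i) for i in range(n))))
    welfare = lambda g: sum(utility(g, i, info, c) for i in range(n))
    equilibria = [g for g in profiles if is_nash(g, info, c)]
    poa = max(map(welfare, profiles)) / min(map(welfare, equilibria))
    mil = max(max(info_gathered(g, i, info) for g in equilibria)
              - min(info_gathered(g, i, info) for g in equilibria)
              for i in range(n))
    return equilibria, poa, mil

info = [{0, 1, 2}, {2, 3}, {3}]
equilibria, poa, mil = analyse(info, c=0.6)
print(len(equilibria), poa, mil)
```

The search visits $2^{N(N-1)}$ profiles, so it is only practical for a handful of agents; it is meant as a check on intuition, not as a scalable method.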

III. Nash Equilibrium Analysis for Homogeneous 
Link Costs 

In this section, we assume that the cost of forming a link
between any two agents $i$ and $j$ is given by $c_{ij} = c, \forall i,j \in \mathcal{N}$.
The goal of this section is to answer the following question:
given an entropic vector $\mathbf{H}$, what are the network topologies $\mathcal{T}$
that can emerge at an NE of the game when the link costs
are homogeneous? We start with the following motivating
example to identify different factors that affect the equilibria
of $\mathcal{G}^N$.

A. Motivating example for two-agent interaction: does information
redundancy matter?

Consider a simple network with only two agents ($N = 2$)
possessing random variables $X_1$ and $X_2$. We aim at characterizing
the equilibria of $\mathcal{G}^2 = (\{1, 2\}, G, \mathbf{u}, \mathbf{H})$. The strategy
of agent 1 is simply a linking decision $g_{12} \in \{0,1\}$, while
for agent 2, the strategy is $g_{21} \in \{0,1\}$. We write $\mathcal{G}^2$ in
normal form in Table I, where the row player is agent 2
and the column player is agent 1. Each cell displays the
utilities of agents 1 and 2 respectively. Assume that the
link cost is the same for both agents and equal to $c$. It
can be easily shown that the payoffs of agent 1 are given
by $u_1(g_{12} = 1, g_{21} = 1) = u_1(g_{12} = 1, g_{21} = 0) = f(H(X_1, X_2)) - c$,
$u_1(g_{12} = 0, g_{21} = 1) = f(H(X_1, X_2))$,
and $u_1(g_{12} = 0, g_{21} = 0) = f(H(X_1))$.

$$\mathrm{MIL} = \max_{i \in \mathcal{N}} \left( \sup_{\mathbf{g}^* \in G^*} H(X_{i \cup \mathcal{R}_i(\mathbf{g}^*)}) - \inf_{\mathbf{g}^* \in G^*} H(X_{i \cup \mathcal{R}_i(\mathbf{g}^*)}) \right) \quad (7)$$

TABLE I: Two-agent network formation game in normal form

               |  $g_{12} = 1$                                             |  $g_{12} = 0$
$g_{21} = 1$   |  $u_1(g_{12} = 1, g_{21} = 1)$, $u_2(g_{12} = 1, g_{21} = 1)$  |  $u_1(g_{12} = 0, g_{21} = 1)$, $u_2(g_{12} = 0, g_{21} = 1)$
$g_{21} = 0$   |  $u_1(g_{12} = 1, g_{21} = 0)$, $u_2(g_{12} = 1, g_{21} = 0)$  |  $u_1(g_{12} = 0, g_{21} = 0)$, $u_2(g_{12} = 0, g_{21} = 0)$


Fig. 1 depicts the entropic region $\Gamma_2^*$ of the two random
variables $X_1$ and $X_2$. The entropic region $\Gamma_2^*$ can be easily
constructed by applying the three Shannon inequalities
$H(X_1) \leq H(X_1, X_2)$, $H(X_2) \leq H(X_1, X_2)$, and
$H(X_1) + H(X_2) \geq H(X_1, X_2)$. The intersection of the three
half-spaces defined by these inequalities in $\mathbb{R}^3$ results in the
polyhedral cone depicted in Fig. 1.
The distance between an entropic vector (depicted by a thick
dot inside $\Gamma_2^*$) and the corresponding entropic vector on $\mathcal{H}$ (the
light-colored hyperplane) with the same $H(X_1)$ and $H(X_2)$
is equal to the KL divergence. If $KL(X_1, X_2) = 0$, then the
entropic vector lies on $\mathcal{H}$, and the 2 agents have non-redundant
information.

Fig. 1: The entropic region $\Gamma_2^*$ for 2 random variables.

The equilibria of this game depend on both the link cost
and the entropic vector, which corresponds to the amount of
information redundancy. For an arbitrary entropic vector, the
game has two possible equilibria $\mathbf{g}^* = (g_{12} = 1, g_{21} = 0)$
and $\mathbf{g}^* = (g_{12} = 0, g_{21} = 1)$ if
$c < f(H(X_1, X_2)) - f(\max\{H(X_1), H(X_2)\})$.
Assume that $H(X_1) > H(X_2)$.
Then, the network has a unique equilibrium $\mathbf{g}^* = (g_{12} = 0, g_{21} = 1)$
when $f(H(X_1, X_2)) - f(H(X_1)) < c < f(H(X_1, X_2)) - f(H(X_2))$,
and a unique equilibrium $\mathbf{g}^* = (g_{12} = 0, g_{21} = 0)$
when $c > f(H(X_1, X_2)) - f(H(X_2))$.
On the other hand, if we fix the link cost and the entropies
$H(X_1)$ and $H(X_2)$, we observe that the equilibria change by
changing the KL divergence. For instance, the network has
two equilibria $\mathbf{g}^* = (g_{12} = 1, g_{21} = 0)$ and $\mathbf{g}^* = (g_{12} = 0, g_{21} = 1)$
when $c < f(H(X_1) + H(X_2) - KL(\mathcal{X})) - f(\max\{H(X_1), H(X_2)\})$.
Thus, as the entropic vector becomes
closer to the hyperplane $\mathcal{H}$, i.e. $KL(\mathcal{X})$ decreases, the
cost threshold for which these two equilibria emerge increases.
This means that the characterization of the NE is sensitive to
the amount of information redundancy $KL(\mathcal{X})$, even if we
fix the individual entropies $H(X_1)$ and $H(X_2)$. Note that the
strategy profile $\mathbf{g} = (g_{12} = 1, g_{21} = 1)$ never emerges as an
NE since under such a profile either of the two agents can break
the link it formed and get a strictly higher utility.
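The three cost regimes described above can be reproduced by checking all four strategy profiles of the two-agent game against the payoffs stated earlier in this example. The following Python sketch does so; the numerical entropies, the costs, and the choice $f(x) = \log(1 + x)$ with the natural logarithm are assumptions made purely for illustration.

```python
from itertools import product
from math import log

def two_agent_equilibria(h1, h2, kl, c, f=lambda x: log(1 + x)):
    """Pure NE of the two-agent game as (g12, g21) pairs, found by checking deviations."""
    h12 = h1 + h2 - kl  # joint entropy H(X1, X2) = H(X1) + H(X2) - KL

    def payoff(own_entropy, own_link, other_link):
        if own_link:
            return f(h12) - c          # the agent pays for the link itself
        return f(h12) if other_link else f(own_entropy)

    equilibria = []
    for g12, g21 in product((0, 1), repeat=2):
        agent1_ok = payoff(h1, g12, g21) >= payoff(h1, 1 - g12, g21)
        agent2_ok = payoff(h2, g21, g12) >= payoff(h2, 1 - g21, g12)
        if agent1_ok and agent2_ok:
            equilibria.append((g12, g21))
    return equilibria

# Assumed values: H(X1) = 3, H(X2) = 2, KL = 1, hence H(X1, X2) = 4.
# Thresholds: f(4) - f(3) = log(5/4) ~ 0.223 and f(4) - f(2) = log(5/3) ~ 0.511.
low = two_agent_equilibria(3, 2, 1, c=0.1)    # below both thresholds
mid = two_agent_equilibria(3, 2, 1, c=0.3)    # between the thresholds
high = two_agent_equilibria(3, 2, 1, c=0.6)   # above both thresholds
# Same cost as `mid`, but no redundancy: the lower threshold rises to log(6/4) ~ 0.405.
no_redundancy = two_agent_equilibria(3, 2, 0, c=0.3)
print(low, mid, high, no_redundancy)
```

With the cost fixed at 0.3, removing the redundancy moves the game from the single-equilibrium regime back to the two-equilibria regime, which is the sensitivity to $KL(\mathcal{X})$ noted in the text.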

B. Characterization of the NE for $\mathcal{G}^N$

In this subsection, we present a generic characterization for
the NE of $\mathcal{G}^N$.

Proposition 1: (Network minimality) In every NE, all network
components are minimally connected.

Proof See Appendix B. ■ 

Proposition 1 implies that agents in each component will 
form the minimal number of links possible to gather the 
maximum amount of information. This results from indirect 
information sharing within each network component, i.e. if 
there exists a path to an agent then there is no extra benefit 
in making a direct link to that agent since all the information 
from that agent is already accessible. 

Next, we characterize the connectivity of the network as a 
function of the link cost in the following Lemma. 

Lemma 1: (Network connectivity regions)

(i) If $c < c_l$, with $c_l = f(H(\mathcal{X})) - f(\min_i H(X_{-i}))$, then,
at every NE (a) the network is minimally connected
(the network has one component) and (b) the amount
of information possessed by each agent is $H(\mathcal{X})$ (all
information is shared).

(ii) If $c > c_u$, where $c_u = f(H(\mathcal{X})) - f(\min_i H(X_i))$, then
there is a unique NE which is strict. At this equilibrium,
the network is fully disconnected and the amount of
information possessed by each agent $i$ is $H(X_i)$ (no
information is shared).

Proof See Appendix C. ■ 

From the above Lemma, we can see that three factors affect
the connectivity of a network: the link cost, the amount of
information possessed by each agent, and the redundancies
among the agents’ information. Based on the result of Lemma
1, we define three regions for the connectivity of the NE
networks based on the link cost as follows:

• Connected agents region ($\mathcal{K}_c$): A network with an entropic
vector $\mathbf{H}$ has a single component when the link
cost is $c < c_l$.

• Isolated agents region ($\mathcal{K}_I$): The network has $N$ components
when the link cost is $c > c_u$.

• Mixed region ($\mathcal{K}_m$): Depending on the entropic vector,
the network can have different numbers of components
ranging from 1 to $N$ when the link cost is $c_l \leq c \leq c_u$.
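The thresholds of Lemma 1 and the three regions can be evaluated in a few lines. The Python sketch below is a hedged illustration: it takes the joint entropy, the individual entropies $H(X_i)$, and the leave-one-out entropies $H(X_{-i})$ as plain numbers, and assumes $f(x) = \log(1 + x)$ with the natural logarithm; the numerical values and function names are invented for the example.

```python
from math import log

def thresholds(h_all, h_single, h_minus, f=lambda x: log(1 + x)):
    """Return (c_l, c_u) following the expressions stated in Lemma 1."""
    c_l = f(h_all) - f(min(h_minus))    # uses min_i H(X_{-i})
    c_u = f(h_all) - f(min(h_single))   # uses min_i H(X_i)
    return c_l, c_u

def region(c, c_l, c_u):
    """Classify a homogeneous link cost c into one of the three connectivity regions."""
    if c < c_l:
        return "connected"
    if c > c_u:
        return "isolated"
    return "mixed"

# Assumed 3-agent example: H(X) = 4, H(X_i) = (3, 2, 1), H(X_{-i}) = (2, 4, 4).
c_l, c_u = thresholds(4, [3, 2, 1], [2, 4, 4])
print(round(c_l, 3), round(c_u, 3))
print([region(c, c_l, c_u) for c in (0.2, 0.6, 1.0)])
```

Since every $X_{-i}$ contains at least one agent's variable, $\min_i H(X_{-i}) \geq \min_i H(X_i)$ and therefore $c_l \leq c_u$, so the three regions never overlap.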
While the connectivity regions describe the impact of link
cost on network topology, they also have informational significance.
For instance, the amount of information possessed
by any agent in the $\mathcal{K}_c$ region is $H(\mathcal{X})$, while in the $\mathcal{K}_I$
region, no agent $i$ gathers any extra information other than its
own intrinsic information $H(X_i)$. On the other hand, agents
in the $\mathcal{K}_m$ region can end up gathering different amounts
of information as there are potentially multiple equilibria 
with different topologies and connectedness. In the following 
illustrative example, we demonstrate the impact of the link 
cost and information redundancy on the network’s connectivity 
regions. 

Illustrative example 1: To illustrate the impact of information
redundancy and link cost on the NE networks’
connectivity, we plot the $\mathcal{K}_m$, $\mathcal{K}_c$, $\mathcal{K}_I$ regions in the
link cost-information redundancy plane for 2 different families
of entropic vectors. Assume that we have a 3-agent
CIN, with $H(X_1) > H(X_2)$, and $H(X_2) = H(X_3)$, and
that agent 1 has non-redundant information, i.e. the random
variable $X_1$ is independent of $X_2$ and $X_3$. Thus, we have
$KL(\mathcal{X}) = I(X_2; X_3)$. We consider two different families
of entropic vectors (i.e. two different assignments for the
values of individual agents’ entropies), the first is given by
$(H^1(X_1) = b, H^1(X_2) = 4, H^1(X_3) = 4)$, whereas the
second is given by $(H^2(X_1) = 7, H^2(X_2) = 4, H^2(X_3) = 2)$.
The connectivity regions associated with entropic vector
family $i$ are denoted by $(\mathcal{K}_c^i, \mathcal{K}_I^i, \mathcal{K}_m^i)$. An exemplary utility
function of $f(x) = \log(1 + x)$ is used. In Fig. 2, we plot the
connectivity regions in the cost-KL divergence plane for the
2 families of entropic vectors. For both families of entropic
vectors, the $\mathcal{K}_m$ region shrinks as the information redundancy
increases. That is, when agents share more information in common,
the NE network connectivity becomes less “uncertain”
since the $\mathcal{K}_m$ region (which is the only region with potentially
multiple equilibria with different levels of connectivity) in
this case will correspond to a limited range of link costs.
Moreover, we note that for the first family of entropic vectors,
when the information of agents 2 and 3 is fully redundant (i.e.
$KL(\mathcal{X}) = 4$), we have a sharp threshold on the link cost,
below which we have a connected network, and above which
we have a fully disconnected network (i.e. the $\mathcal{K}_m$ region is
empty). The intuition behind this is that since agents 2 and
3 are fully “correlated”, they only benefit from connecting to
agent 1. Thus, agent 1 acts as the only information source, and
it is the benefit from getting agent 1’s information that solely
determines the cost at which the network would be connected
or not. If the information of agents 2 and 3 is not redundant, they
add value to the network, and the cost thresholds become dependent
on their information as well. However, for the second
family of entropic vectors, since there is more heterogeneity in
the amount of information possessed by the agents, no single
agent monopolizes the information at any value of the KL
divergence; thus the $\mathcal{K}_m$ region does not vanish for the second
family for any value of $KL(\mathcal{X})$. ■
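The behavior described in this example can be recovered in closed form. The short derivation below is a worked sketch under the example's own assumptions ($X_1$ independent of $(X_2, X_3)$ and $f(x) = \log(1 + x)$), with the thresholds taken as $c_l = f(H(\mathcal{X})) - f(\min_i H(X_{-i}))$ and $c_u = f(H(\mathcal{X})) - f(\min_i H(X_i))$; the shorthand $\kappa = KL(\mathcal{X}) = I(X_2; X_3)$ is introduced here for brevity.

```latex
% Leave-one-out entropies when X_1 is independent of (X_2, X_3):
H(X_{-1}) = H(X_2) + H(X_3) - \kappa, \quad
H(X_{-2}) = H(X_1) + H(X_3), \quad
H(X_{-3}) = H(X_1) + H(X_2).

% Width of the mixed region (the term f(H(\mathcal{X})) cancels):
c_u - c_l = f\big(\min_i H(X_{-i})\big) - f\big(\min_i H(X_i)\big).

% First family: H(X_2) = H(X_3) = 4 and any H(X_1) > 4, with 0 \le \kappa \le 4.
\min_i H(X_{-i}) = 8 - \kappa, \quad \min_i H(X_i) = 4
\;\Rightarrow\;
c_u - c_l = \log\frac{9 - \kappa}{5}, \quad \text{which equals } 0 \text{ at } \kappa = 4.

% Second family: (H(X_1), H(X_2), H(X_3)) = (7, 4, 2), with 0 \le \kappa \le 2.
\min_i H(X_{-i}) = 6 - \kappa, \quad \min_i H(X_i) = 2
\;\Rightarrow\;
c_u - c_l = \log\frac{7 - \kappa}{3} \;\ge\; \log\frac{5}{3} > 0.
```

In both families the width decreases in $\kappa$, matching the shrinking mixed region; it collapses to zero only for the first family at full redundancy, while for the second family it stays bounded away from zero.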

While Lemma 1 focuses on the impact of link cost on the
connectivity of the network, it does not provide a complete
characterization for an NE network. In the next Theorem, we
give the necessary and sufficient conditions for the emergence
of an arbitrary CIN topology in NE.

Fig. 2: Impact of link cost and information redundancy on the
network’s connectivity.

Theorem 2: A network in which the components are precisely
$\{C_1, C_2, \ldots, C_K\}$ can be supported in a NE if and only
if the following relationships between the cost and the value
of information are satisfied

1) $f(H(X_{C_i})) - \min\{f(H(X_{C_i/\{j\}})), f(H(X_j))\} > c$,

2) $f(H(X_{C_i \cup C_j})) - f(H(X_{C_i})) < c$, $\forall i, j \in \{1, 2, \ldots, K\}$.

Proof: See Appendix D. ■

From Theorem 2 we know that, at NE, the network is generally
composed of multiple components and each component
is minimally connected. Each component possesses a set of
random variables that are jointly highly correlated with the joint
random variables possessed by other components. Condition
(1) in Theorem 2 implies that each agent in a component
either benefits from forming a link to some other agent in
that component, or other agents benefit from linking to it,
while condition (2) implies that agents in different components
have no incentive to connect to one another.
Note that due to indirect information sharing, many equilibria
can exist with highly variable topologies. In the subsequent
Theorem, we refine the equilibrium notion used, and we
determine the specific topologies emerging in a strict NE.
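
As a rough illustration of how the conditions of Theorem 2 can be checked numerically, the following Python sketch tests whether a given partition into components satisfies them. It is a minimal sketch under assumptions that are not part of the paper: the joint entropy is supplied by a hypothetical function `joint_entropy` over sets of agents (here the agents are independent, so entropies add), the utility is the exemplary $f(x) = \log(1 + x)$, condition (1) is checked for every agent of every non-singleton component, and condition (2) for every ordered pair of distinct components.

```python
import math
from itertools import permutations


def f(x):
    # Exemplary concave utility f(x) = log(1 + x).
    return math.log1p(x)


def supports_ne(components, joint_entropy, c):
    """Check the cost/value conditions of Theorem 2 for a given partition."""
    # Condition (1): inside each component, every agent j is worth linking to,
    # or gains enough by linking itself (assumed to hold for all j in C_i).
    for comp in components:
        if len(comp) == 1:
            continue  # assumption: singletons need no intra-component check
        whole = f(joint_entropy(comp))
        for j in comp:
            rest = f(joint_entropy(comp - {j}))
            own = f(joint_entropy(frozenset({j})))
            if not whole - min(rest, own) > c:
                return False
    # Condition (2): merging two distinct components is not worth one link.
    for ci, cj in permutations(components, 2):
        if not f(joint_entropy(ci | cj)) - f(joint_entropy(ci)) < c:
            return False
    return True


# Hypothetical data: three independent agents with the entropies below.
ENTROPY = {1: 2.0, 2: 1.0, 3: 1.0}


def joint_entropy(agents):
    return sum(ENTROPY[a] for a in agents)


connected = [frozenset({1, 2, 3})]
disconnected = [frozenset({1}), frozenset({2}), frozenset({3})]
print(supports_ne(connected, joint_entropy, c=0.1))
print(supports_ne(disconnected, joint_entropy, c=2.0))
```

With these illustrative numbers, a low link cost supports the single connected component, while a high link cost supports only the fully disconnected partition.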

Theorem 3: A network is a strict NE if and only if the
following conditions are simultaneously satisfied

• All conditions stated in Theorem 2 are satisfied.

• For each component $C$ of size $M > 1$, there exists a set
$\tilde{C} \subseteq C$ with $|\tilde{C}| \geq M - 1$ such that

$\tilde{C} = \{j \mid f(H(X_C)) - f(H(X_{C/\{j\}})) > c\}$.

• Each non-singleton component forms a core-sponsored
star topology, where the periphery agents belong to the
set $\tilde{C}$.

Proof: See Appendix E. ■
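
To make the set appearing in Theorem 3 concrete, the short Python sketch below computes, for one component, the agents whose removal would reduce the value of the component's information by more than the link cost. It is only an illustration: the entropies, the additive `joint_entropy` function (independent agents), and the utility $f(x) = \log(1 + x)$ are assumptions chosen for the example.

```python
import math


def f(x):
    # Exemplary concave utility f(x) = log(1 + x).
    return math.log1p(x)


def periphery_candidates(component, joint_entropy, c):
    """Agents j with f(H(X_C)) - f(H(X_{C/{j}})) > c."""
    whole = f(joint_entropy(component))
    return {j for j in component
            if whole - f(joint_entropy(component - {j})) > c}


# Hypothetical independent agents: agent 3 holds very little information.
ENTROPY = {1: 2.0, 2: 1.0, 3: 0.05}


def joint_entropy(agents):
    return sum(ENTROPY[a] for a in agents)


component = frozenset({1, 2, 3})
print(periphery_candidates(component, joint_entropy, c=0.1))
```

In this example agents 1 and 2 qualify as periphery agents at a link cost of 0.1, which leaves the low-entropy agent 3 as the only possible core of a core-sponsored star.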

This Theorem states that for homogeneous link formation
costs, each network component of size $M$ comprises a single
agent bearing the cost of connecting to the $M - 1$ other
agents. Such networks exhibit a core-periphery structure, i.e. a
single agent at the core is connected to a set of $M - 1$ periphery
agents. The conditions in Theorem 3 state that the periphery
agents must be high-entropy agents. This is because the benefit
obtained by connecting to a periphery agent $j$ at equilibrium
must exceed the cost, i.e. $f(H(X_C)) - f(H(X_{C/\{j\}})) > c$.
The intuition behind this condition is as follows. For an
agent to be a periphery agent, it must have both high entropy
and low redundancy with the information possessed by other
component members, so that core agents have an incentive
to form a link with it. Fig. 3 depicts an exemplary topology
of a CIN at strict NE for various link formation cost ranges.

Fig. 3: The exemplary strict NE topologies for various link cost
ranges. Panels: $c < f(H(X)) - f(\min_i H(X_{-i}))$;
$f(H(X)) - f(\min_i H(X_{-i})) < c < f(H(X)) - f(\min_i H(X_i))$;
$c > f(H(X)) - f(\min_i H(X_i))$.

In the next subsection, we study the efficiency of the
NE networks and compare the self-organized CINs to those
designed by a network planner.


C. Equilibrium efficiency analysis

The goal of this subsection is to investigate the equilibrium
efficiency of CINs with homogeneous link costs by quantifying
the PoA and the MIL. We start by quantifying the PoA of
CINs in the following Lemma.

Lemma 2: For a CIN with homogeneous link costs, the
Price-of-Anarchy satisfies

$PoA = 1, \forall (\mathbf{h}, c) \in \mathcal{K}_C \cup \mathcal{K}_I$,

and

$PoA \leq \frac{N f(H(X))}{\sum_{i=1}^{N} f(H(X_i))}, \forall (\mathbf{h}, c) \in \mathcal{K}_M$.

Proof: See Appendix F. ■

This Lemma shows that all NE networks in the $\mathcal{K}_C$ and
$\mathcal{K}_I$ regions are socially optimal. While in the $\mathcal{K}_C$ region
multiple equilibria exist, they all have the same social welfare
of $N f(H(X)) - (N - 1)c$. However, in the $\mathcal{K}_M$ region
the NE networks may not be socially optimal, and we give
an upper bound on the PoA. When all agents possess non-redundant
information, the PoA is upper bounded by $N$,
whereas when agents possess redundant information, the upper
bound is strictly smaller than $N$, which indicates that
information redundancy reduces the PoA in the $\mathcal{K}_M$ region.¹
While the social welfare captures the sum utilities, it does not
quantify the individual losses of agents. In the next corollary,
we quantify the MIL for different connectivity regions.

Corollary 1: For a CIN with homogeneous link cost, the
MIL satisfies

$MIL = 0, \forall (\mathbf{h}, c) \in \mathcal{K}_C \cup \mathcal{K}_I$,

and

$MIL \leq H(X) - \min_i H(X_i), \forall (\mathbf{h}, c) \in \mathcal{K}_M$.

Proof: See Appendix G. ■

Fig. X depicts the PoA for a 3-agent CIN with the first family
of entropic vectors defined in illustrative example 1. It can
be seen that the PoA is greater than 1 only in the $\mathcal{K}_M$ region.
In addition, the PoA decreases as the KL divergence increases,
since the value of information in the network decreases,
which means that the best equilibrium (connected network)
achieves a smaller social welfare while the welfare of the
worst equilibrium (fully disconnected network) is independent
of the KL divergence. The PoA also decreases as the link cost
increases. From Fig. X, we can see that when $KL(X) = 4$, the
network exhibits an empty $\mathcal{K}_M$ region, i.e. the network changes
from a connected to a fully disconnected network if the cost
exceeds a certain threshold. Thus, for $KL(X) = 4$ the network
is robust to efficiency loss for all values of link cost, as the $\mathcal{K}_M$
region is the only region where efficiency loss can occur. Fig.
X depicts the MIL upper bound for the same network. It is also
observed that the MIL upper bound decreases monotonically
with increasing information redundancy.


IV. Nash Equilibrium Analysis for Heterogeneous 
Link Costs 

In this section, we extend the analysis of the previous
section to the case in which the cost of link
formation is exclusively recipient-dependent, i.e. $c_{ij} = c_j, \forall i$.
It is easy to show that Proposition 1 applies to the case of
heterogeneous link costs, i.e. all network components that
satisfy the NE conditions are minimally connected.

A. Characterization of the NE

The following proposition relates the link costs to the 
connectivity of the NE networks. 

Proposition 2:

(i) If $c_i < f(H(X)) - f(H(X_{-i})), \forall i \in \mathcal{N}$, then, at every
NE (a) the network is minimally connected (the network
has one component) and (b) the amount of information
possessed by each agent is $H(X)$ (all information is
shared).

(ii) If $f(H(X)) - f(\min_j H(X_{-j})) < \min_{k \in \mathcal{N}/\{i\}} c_k$,
where $i = \arg\min_j H(X_{-j})$, then there is a unique NE
which is strict. At this equilibrium, the network is fully
disconnected and the amount of information possessed
by each agent $i$ is $H(X_i)$ (no information is shared).

Proof: This can be proven straightforwardly using the same
arguments as in the proof of Lemma 1. ■

¹ This is intuitive since when information redundancy increases, the socially
optimal welfare decreases, while the welfare of a disconnected network is
fixed, which means that the PoA decreases.

This proposition shows that the network topology is highly
dependent on the heterogeneity of the agents, as it depends both
on the heterogeneous costs and the heterogeneous information of
agents. Also, the case when all NE networks are connected
corresponds to the $\mathcal{K}_C$ region in the homogeneous cost scenario,
while the case when the NE is a fully disconnected network
corresponds to the $\mathcal{K}_I$ region. An appropriate definition for the
connectivity regions for the heterogeneous cost case is given
by (9), (10), and (11).

In the following Theorem, we give a generic characterization
for this class of networks in NE.

Theorem 4: A network in which the components are precisely
$\{C_1, C_2, \ldots, C_K\}$ can be supported in a NE if and only
if the following relationships between the cost and the value
of information are satisfied

1) $f(H(X_{C_i \cup C_j})) - f(H(X_{C_i})) < \min_{k \in C_j} c_k$, $\forall i, j \in \{1, 2, \ldots, K\}$.

2) $f(H(X_{C_i})) > \min\{f(H(X_{C_i/\{j\}})) + c_j, f(H(X_j)) + \min_{k \in C_i/\{j\}} c_k\}$, $\forall i \in \{1, 2, \ldots, K\}, \forall j \in C_i$.

Proof: This can be proven following the same idea as the
proof of Theorem 2. ■

Note that unlike the homogeneous cost scenario, we cannot 
characterize and plot the connectivity versus a single value 
for link cost since the link cost is now a multidimensional 
parameter. In the next subsection, we analyze the efficiency 
of the NE networks. 


B. Equilibrium Efficiency Analysis 

In this subsection, we quantify the impact of link cost
heterogeneity on the network efficiency. Unlike the case of
homogeneous link costs, we show that information redundancy
induces costly anarchy in the $\mathcal{K}_C$ region when the link costs
are recipient-dependent. In the following Lemma, we quantify
the PoA for the $\mathcal{K}_C$ and $\mathcal{K}_I$ regions.

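Lemma 3: For a CIN with heterogeneous link costs, the PoA
satisfies

$PoA = \begin{cases} 1, & \forall (\mathbf{h}, \mathbf{c}) \in \mathcal{K}_I \\ [\ldots], & \forall (\mathbf{h}, \mathbf{c}) \in \mathcal{K}_C \end{cases}$

and

$PoA \leq \frac{N f(H(X))}{\sum_{i=1}^{N} f(H(X_i))}, \forall (\mathbf{h}, \mathbf{c}) \in \mathcal{K}_M$.

Proof: See Appendix H. ■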
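
The upper bound on the PoA in the $\mathcal{K}_M$ region can be evaluated directly. The following Python sketch does so for the exemplary utility $f(x) = \log(1 + x)$; the function name `poa_upper_bound` and the numerical entropies are illustrative assumptions rather than values taken from the paper, and all individual entropies are assumed to be strictly positive.

```python
import math


def poa_upper_bound(joint, individual, f=math.log1p):
    """N f(H(X)) / sum_i f(H(X_i)) for joint entropy H(X) and entropies H(X_i)."""
    n = len(individual)
    return n * f(joint) / sum(f(h) for h in individual)


individual = [1.0, 1.0, 1.0]                  # hypothetical H(X_i) for N = 3
independent = poa_upper_bound(3.0, individual)  # H(X) = sum of the H(X_i)
redundant = poa_upper_bound(1.0, individual)    # H(X) = max of the H(X_i)
print(independent, redundant)
```

In this example the bound is 2 for independent information and 1 for fully redundant information, in line with the observation that redundancy lowers the bound.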


Thus, unlike in the homogeneous cost scenario, not all NE
networks in the $\mathcal{K}_C$ region are socially optimal. In fact, any
NE network other than a periphery-sponsored star with the
agent having the lowest link cost residing in the core is not
socially optimal. How does information redundancy affect the
PoA in such networks? The following Theorem answers this
question.


Theorem 5: For a CIN with recipient-dependent link costs
in the $\mathcal{K}_C$ region and for fixed values of the individual agents’
entropies, the Price-of-Anarchy is a monotonically increasing
function of the total information redundancy.

Proof: See Appendix I. ■

Thus, in stark contrast with the results obtained for
homogeneous cost CINs, Theorem 5 states that information
redundancy induces costly anarchy for a network in the $\mathcal{K}_C$
region. This results from the heterogeneity of the link formation
costs, which promotes anarchy in the network as agents are no
longer indifferent to the links they form, as they are in the
homogeneous cost scenario. As a matter of fact, some agents may end
up forming “expensive” links and getting the same amount
of information that they could have gathered by forming a
“cheaper” link. When information redundancy increases, the
value of the information gathered by agents decreases; thus,
anarchy costs more and the PoA increases. In contrast, in
the $\mathcal{K}_M$ region, the upper bound on the PoA decreases as the
information redundancy increases, in a similar manner to the
homogeneous link costs scenario. Unlike the PoA, the MIL
upper bound is not sensitive to cost heterogeneity since it is
only sensitive to informational losses. It can be easily shown
that the MIL in recipient-dependent CINs behaves in the same
way as in the homogeneous cost scenario. In the next section,
we tackle the problem of joint information production and link
formation in CINs.

V. Joint Information Production and Link
Formation Games in CINs

In the network formation game so far, we have assumed that
agents in a CIN are endowed with an exogenously determined
entropic vector. Nevertheless, in many practical CINs, agents
decide the amount of information to “produce” given some
production cost, e.g. mobile users in cellular systems may
download data for social-based services by themselves via the
cellular network infrastructure, or get this data opportunistically
from other users by establishing D2D links [11]. In this
section, we focus on a CIN where each agent jointly decides
the amount of information to produce and the links to form.

A. Game formulation 

When agents choose what information to produce, a crucial
aspect that affects the network topology and information
production is how information aggregates. [31] assumes that
information aggregates simply by addition; this will be the
case only if the value of each additional piece of information
is constant, so that there are neither complementarities nor
redundancies. [20] assumes a specific functional form, the
Dixit-Stiglitz function; this captures informational complementarities
and redundancies in a very special way, i.e. agents appreciate
“diversity of information sources” rather than the “diversity
of the information”. In this paper, we consider two modes of
aggregation that seem more natural and are suggested by the
formulation of information in terms of entropy.

$\mathcal{K}_C = \{(\mathbf{h}, \mathbf{c} = (c_1, c_2, \ldots, c_N)) \mid \mathbf{c} \in \mathbb{R}_+^N, \mathbf{h} \in \Gamma_N^*, \text{ and } c_i < f(H(X)) - f(H(X_{-i})), i = \arg\min_j H(X_{-j})\}$. (9)

$\mathcal{K}_I = \{(\mathbf{h}, \mathbf{c} = (c_1, c_2, \ldots, c_N)) \mid \mathbf{c} \in \mathbb{R}_+^N, \mathbf{h} \in \Gamma_N^*, \text{ and } f(H(X)) - f(\min_j H(X_{-j})) < \min_{k \in \mathcal{N}/\{i\}} c_k, i = \arg\min_j H(X_{-j})\}$. (10)

$\mathcal{K}_M = \{(\mathbf{h}, \mathbf{c} = (c_1, c_2, \ldots, c_N)) \mid (\mathbf{h}, \mathbf{c}) \notin \mathcal{K}_C \cup \mathcal{K}_I, \mathbf{c} \in \mathbb{R}_+^N, \mathbf{h} \in \Gamma_N^*\}$. (11)

The information production decision taken by $N$ agents
in a CIN corresponds to the selection of a point inside the
entropic region $\Gamma_N^*$. Correlations between the random variables
of different agents are exogenously determined by external
factors, e.g. geographical locations of sensors. To capture
information redundancy, we define an aggregation function
$F_{\mathcal{H}} : \mathbb{R}_+^N \to \mathbb{R}$ that maps the entropies of a set of agents
to a joint entropy of these agents, i.e. $H(X_1, X_2, \ldots, X_N) =
F_{\mathcal{H}}(H(X_1), H(X_2), \ldots, H(X_N))$. Clearly, the range of the
function $F_{\mathcal{H}}(\cdot)$ should belong to $\Gamma_N^*$. Throughout this section,
we study two different aggregation functions: the first
is the one corresponding to independent random variables,
$H(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i)$, and the second is the
one corresponding to strongly correlated random variables,
$H(X_1, X_2, \ldots, X_N) = \max\{H(X_1), H(X_2), \ldots, H(X_N)\}$.

Both aggregation functions provide insights on how information
redundancy affects the information production decisions
at equilibrium.

In real-world networks, the aggregation function captures
the informational relationships between different agents in a
CIN. For instance, in a sensor network where sensors are
deployed over a correlated random field [25], the information
production decision can be thought of as the precision at
which a sensor quantizes its measurements. Larger precision
corresponds to a larger value of the entropy. However, no matter
what precision a sensor uses, its measurements will be correlated
with those of another nearby sensor. Thus, the joint entropy
of the two sensors is governed not only by the precision
they choose, but also by the redundancy in their information,
which is determined exogenously by their geographical locations
and the nature of the physical process that they sense. The
aggregation function captures such exogenous factors, and
based on it, the behavior of cognitive agents is determined.

In the information production and link formation game,
the strategy of an agent $i$ is denoted by $s_i = (H(X_i), \mathbf{g}_i)$.
A strategy profile of the game is written as $\mathbf{s} =
(H(X_1), H(X_2), \ldots, H(X_N), \mathbf{g})$, and the strategy space is $S$.
The joint information production and link formation
game is defined by the tuple $(\mathcal{N}, S, \mathbf{u})$. Thus, unlike in the
network formation game of the previous sections, agents do
not observe an entropic vector, but decide the entropic
vector based on their knowledge of the aggregation function.

The utility function of agent $i$ is given by

$u_i(\mathbf{s}) = f(H(X_{i \cup \mathcal{R}_i(\mathbf{g})})) - k H(X_i) - |\mathcal{N}_i(\mathbf{g})| c$, (11)

where $k$ is the cost of producing one unit of information,
$|\mathcal{N}_i(\mathbf{g})|$ is the number of agents with which agent $i$ forms links,
and $H(X_{i \cup \mathcal{R}_i(\mathbf{g})})$ is determined by $F_{\mathcal{H}}$ given the production
levels of all agents. We adopt the NE as a solution concept.
Thus, a strategy profile $\mathbf{s}^*$ is an NE profile if no agent benefits
from unilaterally forming a link, breaking a link, or altering the
amount of information it produces. The set of NE profiles is
denoted by $S^*$. Finally, we denote by $\hat{H}$ the maximum amount
of information that each agent can produce at equilibrium;
thus $\hat{H}$ can be obtained by solving $f'(\hat{H}) = k$ [31]. In the
following subsection, we revisit the motivating example of the
two-agent interaction in order to understand the cognitive
behavior of agents in the joint production and link formation game.

Fig. 4: The aggregation function for independent random variables,
$H(X_1, X_2) = H(X_1) + H(X_2)$.

Fig. 5: The aggregation function for strongly correlated random
variables, $H(X_1, X_2) = \max\{H(X_1), H(X_2)\}$.
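
The utility of the joint production and link formation game can be sketched in a few lines of Python. The sketch below is illustrative only and rests on assumptions not fixed by the paper: the utility is the exemplary $f(x) = \log(1 + x)$ (so that $f'(\hat{H}) = k$ has the closed form $\hat{H} = 1/k - 1$, which is non-negative for $0 < k \leq 1$), and the names `utility`, `aggregate_sum`, `aggregate_max`, and `h_hat` are hypothetical.

```python
import math


def f(x):
    # Exemplary concave utility f(x) = log(1 + x).
    return math.log1p(x)


def aggregate_sum(levels):
    # Independent random variables: entropies add.
    return sum(levels)


def aggregate_max(levels):
    # Strongly correlated random variables: joint entropy is the maximum.
    return max(levels, default=0.0)


def utility(own, reachable, n_links, k, c, aggregate):
    """f(H(X_{i u R_i(g)})) - k H(X_i) - |N_i(g)| c for one agent."""
    joint = aggregate([own] + list(reachable))
    return f(joint) - k * own - n_links * c


def h_hat(k):
    """Solve f'(H) = k for f(x) = log(1 + x), i.e. 1 / (1 + H) = k."""
    return 1.0 / k - 1.0


k, c = 0.25, 0.5                      # hypothetical production and link costs
producer = utility(h_hat(k), [], 0, k, c, aggregate_sum)
free_rider = utility(0.0, [h_hat(k)], 1, k, c, aggregate_max)
print(h_hat(k), producer, free_rider)
```

Here `producer` is the utility of an isolated agent producing $\hat{H}$, and `free_rider` is the utility of an agent that produces nothing and pays for one link to such a producer.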

B. Motivating example for two-agent interaction: To produce
or not to produce?

Consider a simple CIN with only two agents ($N = 2$)
who are playing the joint production and link formation game.
We aim at characterizing the equilibria of the game
$(\{1, 2\}, S, \mathbf{u})$, and at investigating the impact
of $F_{\mathcal{H}}$, $k$, and $c$ on the cognitive behavior of the agents.
Specifically, we are interested in identifying scenarios in which
one agent may decide not to produce any information and fully
rely on the other. Let us focus on agent 1. The utility function
of this agent is given by

$u_1(\mathbf{s}) = f(H(X_{1 \cup \mathcal{R}_1(\mathbf{g})})) - k H(X_1) - g_{12} c$,

where $\mathcal{R}_1(\mathbf{g}) = \emptyset$ if $g_{12} = g_{21} = 0$, and $\mathcal{R}_1(\mathbf{g}) = \{2\}$ otherwise.
The best response of agent 1 is given by

$u_1(\mathbf{s}^*) = \max_{g_{12}, H(X_1)} \left( f(H(X_{1 \cup \mathcal{R}_1(\mathbf{g})})) - k H(X_1) - g_{12} c \right)$.

Note that the decision of agent 1 depends on the value of
$H(X_{1 \cup \mathcal{R}_1(\mathbf{g})})$, which is determined by $F_{\mathcal{H}}$. For 2 agents,
the entropic vector is $\mathbf{h} = [H(X_1), H(X_2), H(X_1, X_2)]$.
The function $F_{\mathcal{H}}$ maps the information production decisions
$H(X_1)$ and $H(X_2)$ to $H(X_1, X_2)$. Thus, we
have $H(X_1, X_2) = F_{\mathcal{H}}(H(X_1), H(X_2))$. In the following,
we focus on two different aggregation functions,
$F_{\mathcal{H}}(H(X_1), H(X_2)) = H(X_1) + H(X_2)$ and
$F_{\mathcal{H}}(H(X_1), H(X_2)) = \max\{H(X_1), H(X_2)\}$.

1) $F_{\mathcal{H}}(H(X_1), H(X_2)) = H(X_1) + H(X_2)$: In this
case, the information of agents 1 and 2 is not redundant,
which means that the random variables $X_1$ and $X_2$ are
independent. Thus, $F_{\mathcal{H}}$ maps the production profile of both
agents to a point in the set $\mathcal{H}$. This reduces to the aggregation
function used in [31]. Fig. 4 plots $F_{\mathcal{H}}$, which corresponds
to the upper surface of the convex cone $\Gamma_N^*$ (or equivalently,
the hyperplane $\mathcal{H}$). Assume that the link cost is given by
$c > k\hat{H}$. In this case, we have a unique equilibrium in which
$g_{12}^* = g_{21}^* = 0$ and $H^*(X_1) = H^*(X_2) = \hat{H}$. Thus, we have
a fully disconnected network with both agents producing
information. This means that when the link cost is very
high, every agent decides to produce information and not to
get information from the other. Now assume that $c < k\hat{H}$.
It is easy to show that $g_{12}^* g_{21}^* = 0$, $g_{12}^* = 1$ or $g_{21}^* = 1$,
and $H^*(X_1) + H^*(X_2) = \hat{H}$. Thus, when the link cost is
low, agents generally produce some of the information they
need and get the rest from the other agent.
However, one possible equilibrium has one agent producing
an amount $\hat{H}$ of information with the other forming a link
with it and not producing any information on its own.

2) $F_{\mathcal{H}}(H(X_1), H(X_2)) = \max\{H(X_1), H(X_2)\}$:
Agents may possess fully correlated information, in which case
the joint entropy is always bounded by the entropy of one of
them. Fig. 5 plots $F_{\mathcal{H}}$, which corresponds to the lower surface
of the convex cone $\Gamma_N^*$. In this case, it is never beneficial for
any agent to form a link and produce a positive amount of
information simultaneously. For $c > k\hat{H}$, we have a unique
equilibrium comprising a fully disconnected network with
each agent producing $\hat{H}$. For $c < k\hat{H}$, we have only one
agent producing a positive amount of information in every
equilibrium.

Thus, information redundancy influences the agents’ information
production decisions. When the information contains
no redundancies, there exist many equilibria in which both
agents produce a positive amount of information when $c < k\hat{H}$.
However, for $c < k\hat{H}$, when agents have strongly correlated
information, every equilibrium has only one agent producing
information. Thus, redundancy discourages information sharing
between agents and reduces the number of agents producing
information when the link cost is low. When $c > k\hat{H}$, we
always have a disconnected network with all agents producing
information for both aggregation functions. However, the total
amount of information in the network when the random
variables of both agents are independent is $H(X_1, X_2) = 2\hat{H}$,
while when the information of both agents is fully correlated
(i.e., $H(X_1, X_2) = \max\{H(X_1), H(X_2)\}$), we have
$H(X_1, X_2) = \hat{H}$. In the next subsection, we generalize these
results to the game with $N$ agents.
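
The two-agent equilibria discussed above can be checked by brute force. The Python sketch below tests whether a profile $(H(X_1), H(X_2), g_{12}, g_{21})$ survives all unilateral deviations on a finite grid of production levels. It is a numerical illustration rather than a proof, and it assumes the exemplary utility $f(x) = \log(1 + x)$, hypothetical costs $k = 0.25$ (so $\hat{H} = 3$) and $c = 0.3$, and two-way information flow over any formed link; the function names are illustrative.

```python
import math
from itertools import product


def f(x):
    # Exemplary concave utility f(x) = log(1 + x).
    return math.log1p(x)


def add(a, b):
    # Aggregation for independent information.
    return a + b


def utilities(h1, h2, g12, g21, k, c, aggregate):
    """Utilities of both agents; information flows both ways over any link."""
    linked = bool(g12 or g21)
    info1 = aggregate(h1, h2) if linked else h1
    info2 = aggregate(h1, h2) if linked else h2
    return (f(info1) - k * h1 - g12 * c, f(info2) - k * h2 - g21 * c)


def is_nash(h1, h2, g12, g21, k, c, aggregate, grid, tol=1e-9):
    """True if no unilateral deviation on the grid improves either agent."""
    u1, u2 = utilities(h1, h2, g12, g21, k, c, aggregate)
    for h, g in product(grid, (0, 1)):
        if utilities(h, h2, g, g21, k, c, aggregate)[0] > u1 + tol:
            return False
        if utilities(h1, h, g12, g, k, c, aggregate)[1] > u2 + tol:
            return False
    return True


K, C = 0.25, 0.3                      # hypothetical costs; H_hat = 1/K - 1 = 3
GRID = [i / 100 for i in range(601)]  # production levels 0.00, 0.01, ..., 6.00

# Shared production with one link: stable under addition, not under max.
print(is_nash(1.5, 1.5, 0, 1, K, C, add, GRID))
print(is_nash(1.5, 1.5, 0, 1, K, C, max, GRID))
# A single producer with the other agent linking to it, under max.
print(is_nash(3.0, 0.0, 0, 1, K, C, max, GRID))
```

With these numbers, splitting production between the two agents is an equilibrium only when information aggregates by addition, whereas the single-producer profile is an equilibrium under the max aggregation, matching the discussion of the two cases.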

C. Characterization of the NE and asymptotic information
production behavior

In this subsection, we characterize the NE for
the joint production and link formation game. We study the
equilibria for the two aggregation functions
$F_{\mathcal{H}}^{\Sigma}(H(X_1), H(X_2), \ldots, H(X_N)) = \sum_{i=1}^{N} H(X_i)$
and $F_{\mathcal{H}}^{\max}(H(X_1), H(X_2), \ldots, H(X_N)) =
\max\{H(X_1), H(X_2), \ldots, H(X_N)\}$. In the following Theorem,
we obtain some properties of the equilibria of the game when the
aggregation function is $F_{\mathcal{H}}^{\Sigma}$.

Theorem 6: For the additive aggregation function $F_{\mathcal{H}}^{\Sigma}$ we have:

(1) If $c > k\hat{H}$, then there exists a unique equilibrium
$\mathbf{s}^*$ where the network is fully disconnected and every agent
produces the individually optimal amount of information
($H^*(X_i) = \hat{H}$).

(2) If $c < k\hat{H}$, then $\mathbf{s}^*$ is an equilibrium if and only if:
(i) the CIN is minimally connected, (ii) the total amount of
information is $H(X) = \hat{H}$, and (iii) if any agent $i$ forms
a link in the network ($g_{ij}^* = 1$, $i, j \in \mathcal{N}$), then the cost of
linking is less than the cost of producing the amount
of information obtained by forming a link, $c < k H^*(X_{-i})$.

Proof: See Appendix J. ■

Condition (i) results from indirect information sharing among
connected agents. In addition, the network has a total information
of $\hat{H}$ since all agents perfectly share the information
they produce, which results in condition (ii). Finally, condition
(iii) says that the cost of linking should be less than the cost of
producing the amount of information obtained via linking. In
the following Theorem, we characterize the equilibrium when
the aggregation function is the max aggregation $F_{\mathcal{H}}^{\max}$.

Theorem 7: For the max aggregation function $F_{\mathcal{H}}^{\max}$ we have:

(1) If $c > k\hat{H}$, then there exists a unique equilibrium $\mathbf{s}^*$
where $g_{ij}^* = 0$ and $H^*(X_i) = \hat{H}, \forall i, j \in \mathcal{N}$.

(2) If $c < k\hat{H}$, then $\mathbf{s}^*$ is an equilibrium if and only if: (i)
the CIN is minimally connected, (ii) there exists exactly one
agent $i$ with $H^*(X_i) = \hat{H}$, and $H^*(X_{-i}) = 0$, (iii) all agents
with zero information production form exactly one link.

Proof: See Appendix K. ■

Theorem 7 states that when agents’ information is strongly
correlated, information production is monopolized by exactly
one agent. That is, unlike the case of uncorrelated information,
agents do not distribute the production of information among
multiple agents who produce complementary information.
Thus, we conclude that information redundancy can have a
significant impact on the information production behavior at
equilibrium.

Several questions arise in networks where cognitive agents
take joint information production and link formation decisions:
what is the fraction of agents producing information at
equilibrium in an asymptotically large network? What is the
asymptotic total amount of information in the network? In the
rest of this subsection, we address these two questions and provide
a characterization of the asymptotic informational behavior of
agents in a CIN.

Denote the set of agents producing information at equilibrium
by $I(\mathbf{s}^*) = \{i \mid i \in \mathcal{N}, \text{ and } H^*(X_i) > 0\}$. [31] shows
that if agents produce non-redundant information and there is
no indirect information sharing, then in equilibrium, information
is produced by only a small subset of agents, and the
fraction of information producers becomes vanishingly small
as the network size grows, i.e. $\lim_{N \to \infty} \sup_{\mathbf{s}^* \in S^*} \frac{|I(\mathbf{s}^*)|}{N} = 0$.
[31] calls this “the law of the few”. In the next corollary, we
characterize the fraction of information producers and the total
amount of information in an asymptotically large network when
the link cost is large.

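Corollary 2: In the joint production and link formation game,
when $c > k\hat{H}$, we have

$\lim_{N \to \infty} \frac{|I(\mathbf{s}^*)|}{N} = 1$, (12)

for both the additive aggregation $F_{\mathcal{H}}^{\Sigma}$ and the max aggregation
$F_{\mathcal{H}}^{\max}$. For $F_{\mathcal{H}}^{\max}$, the total amount of information
in the network is given by

$\lim_{N \to \infty} H(X_1, X_2, \ldots, X_N) = \hat{H}$, (13)

while for $F_{\mathcal{H}}^{\Sigma}$ we have

$\lim_{N \to \infty} H(X_1, X_2, \ldots, X_N) = \infty$. (14)

Proof: See Appendix L. ■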

Corollary 2 says that when the link cost is very high, the
network is fully disconnected and every agent produces the
information it needs. Thus, when the network is asymptotically
large, every agent is an information producer no matter what
the amount of information redundancy is. The number of
agents producing information is always $N$. While the number
of information producers does not depend on $F_{\mathcal{H}}$, it is clear
that the total amount of information in the network depends
on the amount of redundancy. When the agents’ information
is strongly correlated, the total amount of information is
always bounded by $\hat{H}$. On the other hand, when agents have
uncorrelated information, the total amount of information in an
asymptotically large network is unbounded. In the next corollary,
we study the information production behavior
when the link cost is low.

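Corollary 3: In the joint production and link formation game,
when $c < k\hat{H}$, the fraction of information producers for the
max aggregation $F_{\mathcal{H}}^{\max}$ is given by

$\lim_{N \to \infty} \sup_{\mathbf{s}^* \in S^*} \frac{|I(\mathbf{s}^*)|}{N} = 0$, (15)

while for the additive aggregation $F_{\mathcal{H}}^{\Sigma}$ we have

$\lim_{N \to \infty} \sup_{\mathbf{s}^* \in S^*} \frac{|I(\mathbf{s}^*)|}{N} = 1$. (16)

For both $F_{\mathcal{H}}^{\Sigma}$ and $F_{\mathcal{H}}^{\max}$, the total information in the network is

$\lim_{N \to \infty} H(X_1, X_2, \ldots, X_N) = \hat{H}$. (17)

Proof: See Appendix M. ■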

This corollary states that the law of the few introduced
in [31] does not generally apply in the case of indirect
information sharing. The applicability of the law of the few
depends on the link cost and information redundancy. For
instance, in a network with agents producing highly redundant
information, the law of the few only applies when the link
cost is $c < k\hat{H}$, whereas for $c > k\hat{H}$, all agents will be
information producers, as shown in Fig. 6, where we display
the network topology at equilibrium with each agent labeled
by the amount of information it produces for $c > k\hat{H}$ and
$c < k\hat{H}$. Moreover, there exist information aggregation
functions for which the network at NE can have all the agents
being information producers for any link cost, and production
is no longer dominated by a small set of hub agents. Thus, even
for low link costs, the applicability of the law of the few is
still governed by the amount of information redundancy. If the
agents’ information is strongly correlated, the law of the few
applies and information production is dominated by a small
fraction of agents in every equilibrium for $c < k\hat{H}$. In contrast,
when the agents produce non-redundant information, the law
of the few fails even for low link costs, i.e. $c < k\hat{H}$. Fig. 7
depicts the equilibria for an 8-agent network with $c < k\hat{H}$
under the additive aggregation $F_{\mathcal{H}}^{\Sigma}$ and the max aggregation
$F_{\mathcal{H}}^{\max}$. It is observed
that the law of the few applies when the aggregation function
is $F_{\mathcal{H}}^{\max}$, but fails when the aggregation function is $F_{\mathcal{H}}^{\Sigma}$.

Note that while we focused on the extreme cases of information
redundancy by considering the additive and max aggregation
functions $F_{\mathcal{H}}^{\Sigma}$ and $F_{\mathcal{H}}^{\max}$, the analysis can be extended
to other generic aggregation functions. Such generic aggregation
functions should be derived from a real-world network setting
(e.g. geographical deployment of sensor networks), and an
interesting problem is then to study the information production
behavior of agents under these aggregation functions. However,
it is sufficient to consider only $F_{\mathcal{H}}^{\Sigma}$ and $F_{\mathcal{H}}^{\max}$ to show that
the celebrated law of the few does not generally hold whenever
information redundancy is considered.

VI. Future work and extensions 

In this section, we propose some potential future research 
directions that capitalize on our model. 

1- Dynamic games with incomplete information: we have
considered a one-shot complete information game in which all
agents have knowledge of the entropic vector. An extension
to our model is to consider a dynamic game with incomplete
information [23], in which agents learn the entropic vector
over time by interacting with other agents. In this case, agents
would pay a link maintenance cost to stay connected to
informative agents, and would break links with non-informative
ones. In such a model, the network can be characterized in terms
of the probability of emergence of a certain topology at NE, and
the time needed for the network to converge to a steady-state
topology.

Fig. 6: Connectivity and information production behavior in a
network with strongly correlated information sources.

Fig. 7: Exemplary equilibria for different aggregation functions. The
law of the few does not apply when the agents’ information has no
redundancies. Panels: “Law of the few does not apply” and “Law of
the few applies”.

2- Incorporating capacity-constrained links: we have assumed
perfect indirect information sharing among agents. In
some settings, such as multi-hop relaying networks, information
sharing can be lossy and the links between agents can
be capacity-constrained. While lossy benefit flow has been
modeled before by assuming that benefits are discounted at
each link [16], in our model lossy information sharing can be
modeled using an information-theoretic approach by treating
links as erroneous channels. Incorporating these factors into
our model can lead to interesting results on both the network
topology at NE and the information production behavior.

3- New solution concepts: the network formation game
considered in this paper adopts the NE as a solution concept.
However, different networks and applications may be better
served by different solution concepts. For instance, in many
applications, such as D2D communications, establishing a link
requires mutual consent among agents. In this case, pairwise
stability can be used as a solution concept instead of the NE.


VII. Conclusions 

In this work, we presented a first model for the endogenous
formation of networks by cognitive agents who aim at gathering
and producing information. Using Nash Equilibrium as
a solution concept, we formulated a non-cooperative network
formation game where agents get informational benefits by
forming costly links with each other. We showed that the
information possessed by the cognitive agents affects the
network topology, efficiency, and information production
behavior. We showed the impact of information redundancy on the
topologies of NE networks, and its impact on the network
efficiency in terms of the Price-of-Anarchy and Maximum
Information Loss. Finally, we considered the asymptotic behavior
of a network where each agent both produces information
and forms links with other agents. For such networks, we
studied the impact of information redundancy on the number
of agents producing information at equilibrium. We showed that
the validity of the law of the few depends on how information
aggregates.

Appendix A 
Proof of Theorem 1 

Prom Nash’s Existence Theorem, we know that if we allow 
mixed strategies, then every game with a finite number of 
players in which each player can choose from finitely many 
pure strategies has at least one Nash equilibrium [38]. Assume 
that agent i adopts a mixed strategy = {pii,Pi 2 , 
where pij is the probability that agent i forms a link with 
agent j, and pa = 0,Vi G Af. The utility of agent i in this 
case is obtained by averaging over all possible networks as 
follows 

N-1 

Ui{Ai) = lUj/(iT(Ai UXaJ) - ^ Pijc, (A.l) 

j=i 1=1 

where aj is an element of the power set of and Wj 

is the probability of the emergence of a network component 
comprising agents in the set {iUaj} based on the mixed 
strategies. Eor instance, in a 2 agent network, the utility 
function of agent 1 is given by 

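$$u_1(A_1) = \big(p_{12}(1 - p_{21}) + p_{21}(1 - p_{12}) + p_{12}p_{21}\big) f\big(H(\mathcal{X}_1, \mathcal{X}_2)\big) + (1 - p_{12})(1 - p_{21}) f\big(H(\mathcal{X}_1)\big) - p_{12}\, c.$$

In this case, $w_1 = p_{12}(1 - p_{21}) + p_{21}(1 - p_{12}) + p_{12}p_{21}$
and $w_2 = (1 - p_{12})(1 - p_{21})$. Let the NE strategy profile
be $A^* = (A_1^*, A_2^*, \ldots, A_N^*)$, where $A_i^* = (p_{i1}^*, p_{i2}^*, \ldots, p_{iN}^*)$.
According to (3), the following condition on $A^*$ needs to be
satisfied

$$u_i(A_i^*, A_{-i}^*) \geq u_i(A_i, A_{-i}^*), \quad \forall A_i \in [0,1]^N, \ \forall i \in \mathcal{N}. \tag{A.2}$$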

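Now we show that for any agent $i$, the NE strategy $A_i^*$ needs
to be a pure strategy for condition (A.2) to be satisfied. We
focus on agent $i$ with an NE strategy $A_i^* = (p_{i1}^*, p_{i2}^*, \ldots, p_{iN}^*)$,
where $p_{ij}^* \in [0,1]$. Now assume we induce a perturbation $\epsilon$ to
the mixed strategy of agent $i$ by modifying $p_{ik}^*$ to $p_{ik}^* + \epsilon$ for
a certain $k$, where $\epsilon \in [-p_{ik}^*, 1 - p_{ik}^*]$. We call this modified
strategy $A_i^*(\epsilon)$. Note that we can write any $w_j$ in (A.1) in the
form $w_j = \bar{w}_j p_{ik}^* + \tilde{w}_j (1 - p_{ik}^*)$. This results in a perturbed
utility $u_i(A_i^*(\epsilon))$ as follows

$$u_i(A_i^*(\epsilon)) = \sum_{j=1}^{2^{N-1}} \Big( \bar{w}_j (p_{ik}^* + \epsilon) f\big(H(\mathcal{X}_i \cup \mathcal{X}_{a_j})\big) + \tilde{w}_j (1 - \epsilon - p_{ik}^*) f\big(H(\mathcal{X}_i \cup \mathcal{X}_{a_j})\big) \Big) - (p_{ik}^* + \epsilon)\, c - \sum_{l=1,\, l \neq k}^{N} p_{il}^*\, c, \tag{A.3}$$

which can be rearranged as

$$u_i(A_i^*(\epsilon)) = u_i(A_i^*) + \epsilon \Big( \sum_{j=1}^{2^{N-1}} (\bar{w}_j - \tilde{w}_j) f\big(H(\mathcal{X}_i \cup \mathcal{X}_{a_j})\big) - c \Big).$$

Let $\delta = \sum_{j=1}^{2^{N-1}} (\bar{w}_j - \tilde{w}_j) f\big(H(\mathcal{X}_i \cup \mathcal{X}_{a_j})\big) - c$. It can be
easily shown that $\partial u_i(A_i^*(\epsilon)) / \partial \epsilon > 0$ if $\delta > 0$, and
$\partial u_i(A_i^*(\epsilon)) / \partial \epsilon < 0$
otherwise. Thus, if $\delta > 0$, agent $i$ can always increase its utility
by increasing $\epsilon$ and setting $\epsilon = 1 - p_{ik}^*$ (and thus playing a
pure strategy with $p_{ik} = 1$), and if $\delta < 0$, agent $i$ can always
increase its utility by setting $\epsilon = -p_{ik}^*$ (and thus playing a pure
strategy with $p_{ik} = 0$), which contradicts $A_i^*$ being an NE
strategy. Thus, for all $k \in \mathcal{N} \setminus \{i\}$, agent $i$ needs to select a
pure strategy $p_{ik}^* \in \{0,1\}$ for $A_i^*$ to be a best response to
$A_{-i}^*$ regardless of the strategies of other agents, i.e. non-pure
strategies are always dominated by a pure strategy. Due to
symmetry, this applies to all agents in $\mathcal{N}$. Therefore, it follows
that a pure strategy NE always exists.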
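
The two-agent example in this proof can be checked numerically. The Python sketch below evaluates agent 1's expected utility and illustrates that it is linear in $p_{12}$, so a best response can always be found at an endpoint. The function names and the numeric values standing in for $f(H(\cdot))$ and $c$ are illustrative assumptions, not quantities taken from the paper.

```python
def u1(p12, p21, f_joint, f_own, c):
    """Expected utility of agent 1 in the two-agent game.

    f_joint stands for f(H(X1, X2)) and f_own for f(H(X1)); both are
    hypothetical inputs supplied by the caller.
    """
    w1 = p12 * (1 - p21) + p21 * (1 - p12) + p12 * p21  # at least one link forms
    w2 = (1 - p12) * (1 - p21)                           # no link forms
    return w1 * f_joint + w2 * f_own - p12 * c


def slope_in_p12(p21, f_joint, f_own, c):
    """Derivative of u1 with respect to p12 (constant, because u1 is linear in p12)."""
    return (1 - p21) * (f_joint - f_own) - c


def best_response_p12(p21, f_joint, f_own, c):
    """A pure best response: link with probability 1 only if the slope is positive."""
    return 1.0 if slope_in_p12(p21, f_joint, f_own, c) > 0 else 0.0
```

With these assumed values, a positive slope pushes agent 1 to $p_{12} = 1$ and a negative slope to $p_{12} = 0$, which mirrors the role played by the sign of $\delta$ in the proof.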



Appendix B 

Proof of Proposition 1 

If the component $C$ is not minimally connected, then it has at
least one cycle, i.e. there exist agents $i$ and $j$ that are connected
via (at least) two paths $p_{ij,1}$ and $p_{ij,2}$ such that neither
path is a subset of the other. For such a component at
NE, assume that agent $v$ is on path $p_{ij,1}$ and agent $w$ is on
path $p_{ij,2}$. Note that all the agents receive the same amount of
total information $H(C)$, and we know that there exist
links $g_{vx}^*$ (or $g_{xv}^*$) and $g_{wy}^*$ (or $g_{yw}^*$), where agent $x \in p_{ij,1}$ and
agent $y \in p_{ij,2}$. Now focus on any of these links, say $g_{wy}^* = 1$.
We observe that agent $w$ can break this link and still receive
the same benefit by gathering the same amount of information
via path $p_{ij,1}$, thus obtaining a strictly higher utility
as it no longer pays the cost of the link with agent $y$, which
contradicts the fact that $g^*$ is an NE. Thus, a single path
exists between any two agents.
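
The argument above says that any link closing a cycle is redundant and would be dropped at equilibrium. As a minimal sketch of how such links can be detected, the following Python function uses a union-find structure; the function name and the representation of links as index pairs are assumptions made for illustration.

```python
def find_redundant_links(n_agents, links):
    """Return the links that close a cycle, scanning `links` in order.

    `links` is a list of (i, j) pairs over agents 0..n_agents-1. Direction
    (who sponsors the link) is ignored, since information flows both ways.
    """
    parent = list(range(n_agents))

    def find(a):
        # Follow parent pointers to the component representative.
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    redundant = []
    for i, j in links:
        ri, rj = find(i), find(j)
        if ri == rj:
            redundant.append((i, j))  # i and j were already connected
        else:
            parent[ri] = rj           # merge the two components
    return redundant
```

An empty result means every component is minimally connected; a non-empty result lists links whose sponsor could delete them without losing any information.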


Appendix C 
Proof of Lemma 1

If there exists an agent to which other agents have an incentive
to connect even if they possess all other information
in the network, then the network is connected at any
equilibrium. This is satisfied if and only if the linking cost
satisfies $c < f(H(\mathcal{X})) - f(H(\mathcal{X}_{-i}))$ for some agent $i$ in
$\mathcal{N}$, i.e. the marginal benefit from connecting to that agent
always exceeds the link cost irrespective of the current
connections of the agent forming the link. Thus, we must
have $c < \max_i \big( f(H(\mathcal{X})) - f(H(\mathcal{X}_{-i})) \big)$. Hence, part (i) of
the Lemma follows.

If no agent has an incentive to form any link, then the
network is fully disconnected. From the monotonicity property
of the entropy, we know that if agent $i$ has no incentive to
connect to a set $V$ of agents, then it has no incentive to connect
to a set $W$ if $W \subset V$. Thus, if agent $i$ has no incentive to
connect to the set $\mathcal{N} \setminus \{i\}$ via a single link, then it has no
incentive to form any link in the network. This occurs if and
only if $c > f(H(\mathcal{X})) - f(H(\mathcal{X}_i))$. If this condition is satisfied
for all agents, then the network is disconnected, and
part (ii) of the Lemma follows.
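
The two thresholds in this proof can be turned into a simple cost classifier. The sketch below is only illustrative: the function name, the argument layout, and the label "mixed" for costs between the two thresholds are assumptions, and the inputs are the values of $f(H(\cdot))$ supplied by the caller.

```python
def classify_region(c, f_all, f_without, f_own):
    """Classify the linking cost c using the two thresholds of Lemma 1.

    f_all        : f(H(X)), benefit from all information in the network
    f_without[i] : f(H(X_{-i})), benefit from everything except agent i's information
    f_own[i]     : f(H(X_i)), benefit from agent i's own information
    """
    # Part (i): some agent is worth linking to even when all else is known.
    connect_threshold = max(f_all - fw for fw in f_without)
    # Part (ii): c exceeds f(H(X)) - f(H(X_i)) for every agent i.
    isolate_threshold = max(f_all - fo for fo in f_own)
    if c < connect_threshold:
        return "connected"
    if c > isolate_threshold:
        return "disconnected"
    return "mixed"
```

For example, with the assumed values `f_all = 3.0`, `f_without = [2.5, 2.0, 2.8]` and `f_own = [1.0, 1.5, 0.5]`, the thresholds are 1.0 and 2.5, so a cost of 0.8 yields a connected network at every equilibrium and a cost of 3.0 yields a fully disconnected one.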

Appendix D 
Proof of Theorem 2 

For the network to be an NE, no agent should have an
incentive to unilaterally deviate by forming a new link or
breaking an existing link. We focus on an arbitrary network
component at NE, say component $C_i$. Inside this component,
each agent should either have an incentive to form at least
one link, or other agents should have an incentive to connect
to it. Otherwise, this agent can get disconnected from the
component (either by breaking its own link or by other agents
breaking their links with it) while strictly increasing
its utility or the utility of other agents in the component.
Thus, we must have either $f(H(\mathcal{X}_{C_i})) - f(H(\mathcal{X}_j)) > c$
or $f(H(\mathcal{X}_{C_i})) - f(H(\mathcal{X}_{C_i \setminus \{j\}})) > c$ for all agents $j$ in $C_i$.
This should apply to all components in the network. Hence,
condition (1) follows.

Now focus on the interaction between different components
of the network. If any agent in component $C_i$ benefits from
forming a link to any agent in component $C_j$, then the network
is not an NE since in this case an agent in $C_i$ can strictly increase
its utility by unilateral deviation. Hence, we should have
$f(H(\mathcal{X}_{C_i \cup C_j})) - f(H(\mathcal{X}_{C_i})) < c$ for any two components
in the network. Thus, condition (2) follows.

Appendix E 
Proof of Theorem 3 

When the three conditions in Theorem 3 are satisfied,
the network is a strict NE since the action of each agent is
strictly better than any other action, i.e. core agents in each
component are strictly better off when sponsoring the periphery
agents, and all the periphery agents are strictly better off when
they do not form any links. Thus, no agent is indifferent between
multiple actions, which implies that the NE is strict. Now we
prove the converse by showing that if the network is a strict
NE, the 3 conditions in Theorem 3 must be satisfied. Under
strict NE, a non-singleton component $C$ has two agents $i$ and
$j$ such that $g_{ij}^* = 1$. Now assume that $g_{kj}^* = 1$ for some
agent $k \in C$. It is clear that $k$ can achieve the same utility by
deleting its link with $j$ and connecting to $i$. This contradicts
the fact that $g^*$ is a strict NE. Thus, $g_{kj}^* = 0$. Using
a similar argument, it can be shown that $g_{jk}^* = 0$. Thus, we
conclude that $g_{ik}^* = 1$. This is true for all agents $k \in C$,
which implies that a single core agent $i$ forms links with all
other agents in $C$. Therefore, the core agent $i$ should strictly
increase its utility with each of the $M - 1$ links it forms. The
marginal utility of agent $i$ from forming a link with agent $j$,
given that $i$ is connected to all other agents in $C$, is given by
$f(H(\mathcal{X}_C)) - f(H(\mathcal{X}_{C \setminus \{j\}})) - c$. This should be positive for
all agents $j$ in $\bar{C}$, where $\bar{C}$ is the set of the $M - 1$ periphery
agents, because otherwise core agent $i$ can break some of the
links in the component. Thus, for agent $i \in C$ to be a core
agent, and for agents $j \in \bar{C}$ to reside in the periphery, we
must have $f(H(\mathcal{X}_C)) - f(H(\mathcal{X}_{C \setminus \{j\}})) > c, \forall j \neq i$. Note that
conditions (1) and (2) in Theorem 2 should also be satisfied
for the network to be an NE, while the feasibility of organizing
each component as a core-sponsored star guarantees that the
network is a strict NE. Thus, a strict NE exists if there exists
an NE in which the set $\bar{C}$ has a cardinality that is not less than
$M - 1$ for all components, i.e. a single core agent can sponsor
each component.

Appendix F
Proof of Lemma 2 

We know that in the $\mathcal{K}_c$ region, all the NE networks are
minimally connected. For a minimally connected network, each agent
has an aggregate benefit of $f(H(\mathcal{X}))$ and the total number of
links is $N - 1$ (the total cost is $(N - 1)c$), thus the social welfare
of any minimally connected network with strategy profile $g$ is
given by

$$U(g) = N f(H(\mathcal{X})) - (N - 1)c.$$

In the following we show that this is indeed the maximum
social welfare in the $\mathcal{K}_c$ region, which means that the socially
optimal network in this region is minimally connected. Note
that the maximum sum benefit over all agents in the network
is $N f(H(\mathcal{X}))$, i.e. all agents share all information, thus any
connected network maximizes the sum benefit. Recall that in
the $\mathcal{K}_c$ region, we have $c < f(H(\mathcal{X})) - f(\min_i H(\mathcal{X}_{-i}))$.
Thus, for any (disconnected) network with fewer than $N - 1$
links, the social welfare can always be increased by adding
a set of links that makes the network (minimally) connected.
On the other hand, we know from the pigeonhole principle
that any network with more than $N - 1$ links has cycles, thus
the social welfare can always be increased by breaking a set
of links such that all cycles are eliminated while keeping
the network minimally connected. Therefore, we conclude
that the socially optimal network in the $\mathcal{K}_c$ region is minimally
connected, with social welfare $N f(H(\mathcal{X})) - (N - 1)c$. Since
the social welfare of any NE network in $\mathcal{K}_c$ is given by
$U(g^*) = N f(H(\mathcal{X})) - (N - 1)c$, every NE network
is socially optimal and we have PoA = 1.

Next, we focus on the $\mathcal{K}_i$ region. In this region, any connection
results in a negative payoff for any agent who forms a link since
$c > f(H(\mathcal{X})) - f(\min_i H(\mathcal{X}_i))$. Thus, the socially optimal network is
a fully disconnected network, which is also the unique (strict)
NE, and PoA = 1 in the $\mathcal{K}_i$ region.

For the $\mathcal{K}_m$ region, we
compute an upper bound on the PoA. The social welfare
of any equilibrium network in the $\mathcal{K}_m$ region is lower bounded
by $\sum_{i=1}^{N} f(H(\mathcal{X}_i))$, i.e. $\inf_{g^* \in G^*} U(g^*) \geq \sum_{i=1}^{N} f(H(\mathcal{X}_i))$,
with equality when $f(H(\mathcal{X}_i, \mathcal{X}_j)) - f(H(\mathcal{X}_i)) < c, \forall i,j \in \mathcal{N}$,
and $f(H(\mathcal{X})) - f(H(\mathcal{X}_i)) > c, \forall i$ (i.e., agents do
not get immediate benefit from forming links to individual
agents, thus a fully disconnected network is an NE since not
forming a link is a best response for all agents in a fully
disconnected network). On the other hand, the social welfare
of the socially optimal network in the $\mathcal{K}_m$ region is upper
bounded by $N f(H(\mathcal{X}))$, i.e. the social welfare is always
strictly less than the sum benefit of all agents when they
possess all the information in the network. Thus, it follows
that

$$\mathrm{PoA} < \frac{N f(H(\mathcal{X}))}{\sum_{i=1}^{N} f(H(\mathcal{X}_i))}.$$
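
The welfare expressions used in this proof are straightforward to evaluate. The following Python sketch computes the welfare of a minimally connected network, the welfare of a fully disconnected network, and the resulting upper bound on the PoA; all function names and the sample values of $f(H(\cdot))$ are hypothetical and serve only to illustrate the bound.

```python
def welfare_minimally_connected(n_agents, f_all, c):
    """Social welfare N f(H(X)) - (N - 1) c of any minimally connected network."""
    return n_agents * f_all - (n_agents - 1) * c


def welfare_fully_disconnected(f_own):
    """Social welfare with no links: agent i only obtains f(H(X_i))."""
    return sum(f_own)


def poa_upper_bound(f_all, f_own):
    """Bound N f(H(X)) / sum_i f(H(X_i)) for the intermediate cost region."""
    return len(f_own) * f_all / sum(f_own)
```

With the assumed inputs `f_all = 3.0`, `f_own = [1.0, 1.5, 0.5]` and `c = 1.7`, the ratio of the connected welfare to the disconnected welfare stays below the bound, as the proof requires.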

Appendix G 
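Proof of Corollary 1

In the $\mathcal{K}_c$ region, we know that all NE networks are connected.
Thus, $\sup_{g^* \in G^*} H(\mathcal{X}_i \cup \mathcal{X}_{\mathcal{R}_i(g^*)}) = \inf_{g^* \in G^*} H(\mathcal{X}_i \cup \mathcal{X}_{\mathcal{R}_i(g^*)})
= H(\mathcal{X})$, and MIL = 0. Similarly, in the $\mathcal{K}_i$ region,
we have $\sup_{g^* \in G^*} H(\mathcal{X}_i \cup \mathcal{X}_{\mathcal{R}_i(g^*)}) = \inf_{g^* \in G^*} H(\mathcal{X}_i \cup
\mathcal{X}_{\mathcal{R}_i(g^*)}) = \min_i H(\mathcal{X}_i)$, thus MIL = 0. In the $\mathcal{K}_m$
region, the MIL is maximized if both a connected and
a fully disconnected network are equilibria. In this case,
$\sup_{g^* \in G^*} H(\mathcal{X}_i \cup \mathcal{X}_{\mathcal{R}_i(g^*)}) = H(\mathcal{X})$, and $\inf_{g^* \in G^*} H(\mathcal{X}_i \cup
\mathcal{X}_{\mathcal{R}_i(g^*)}) = \min_i H(\mathcal{X}_i)$. Thus, MIL $\leq H(\mathcal{X}) - \min_i H(\mathcal{X}_i)$,
with equality when $c > f(H(\mathcal{X}_i, \mathcal{X}_j)) - f(H(\mathcal{X}_i)), \forall i,j \in
\mathcal{N}$, and $c < f(H(\mathcal{X})) - f(H(\mathcal{X}_i)), \forall i \in \mathcal{N}$.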

Appendix H 
Proof of Lemma 3 

For a connected network in the $\mathcal{K}_c$ region, the utility of
agent $i$ is given by $u_i(g^*) = f(H(\mathcal{X})) - \sum_{m \in \mathcal{N}_i(g^*)} c_m$.
The social welfare is given by $U(g^*) = \sum_{i=1}^{N} u_i(g^*)$. Since
we know from Proposition 1 that the network is minimally
connected at equilibrium, it has exactly $N - 1$ links.
Therefore, we have $U(g^*) = N f(H(\mathcal{X})) - \sum_{j \in J} c_j$, where
$J$ is the set of links in the network designated by the index of
the link recipient, and $|J| = N - 1$. The socially optimal topology
is the connected network with minimum total link cost, which
corresponds to a periphery-sponsored star with the agent $k =
\arg\min_j c_j$ residing in the core of the star. The social welfare
of this topology is $U(g^*) = N f(H(\mathcal{X})) - (N - 1) \min_j c_j$.
Note that this is also an NE, as no agent
benefits from breaking its link with the core agent and linking to
another periphery agent.

Next, we identify the equilibrium
with the worst social welfare. Assume that the link costs are
arranged in ascending order as $c_1 < c_2 < c_3 < \ldots < c_{N-1} < c_N$. We
know that the network is minimally connected, thus the total cost
of link formation is given by $\sum_{j \in J} c_j$. What are the elements
of the set $J$ such that the total cost is maximized and the
network is at equilibrium? Note that for the socially optimal
profile, $J = \{1, 1, 1, \ldots, 1\}$, with a cardinality of $N - 1$. Now
assume a line network with $g_{i,i+1} = 1, \forall 1 \leq i < N$. Thus, we
have $g_{12} = g_{23} = \cdots = g_{N-1,N} = 1$, and $J = \{2, 3, \ldots, N - 1, N\}$.
It can be easily shown that this line network is stable,
since no agent $i$ can break its link with agent $i + 1$ and increase
its utility. For instance, if $i$ breaks its link with $i + 1$, it must
connect to some agent $j > i + 1$ to receive the same amount
of information but at a higher cost. It can also be shown that
this is the worst equilibrium. This is because for a connected
network, only one agent $i$ connects to the agent $N$ with the
highest link cost, and others can connect to $i$ (which has a
lower link cost) and get the information of $N$ via indirect
sharing. The same applies to agent $N - 1$, where one agent
connects to it, and others share information by connecting to
that agent. Thus, to maximize the total link cost and maintain
equilibrium, only one link is formed with each agent except
the one with the minimum link cost. Thus, the social welfare
in this case is $N f(H(\mathcal{X})) - \sum_{j=1}^{N} c_j + \min_k c_k$, and the PoA
formula follows.

For the $\mathcal{K}_i$ and $\mathcal{K}_m$ regions, the proof is the same as that
of Lemma 2.
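
The best and worst equilibria identified above for recipient-dependent link costs can be compared numerically. The sketch below is a minimal illustration under assumed inputs; the function names are hypothetical, `f_all` stands for $f(H(\mathcal{X}))$, and `costs[j]` stands for the cost $c_j$ of linking to agent $j$.

```python
def star_welfare(f_all, costs):
    """Welfare of the periphery-sponsored star centred on the cheapest agent:
    N f(H(X)) - (N - 1) min_j c_j."""
    n = len(costs)
    return n * f_all - (n - 1) * min(costs)


def line_welfare(f_all, costs):
    """Welfare of the line network in which every agent except the cheapest
    one receives exactly one link: N f(H(X)) - sum_j c_j + min_k c_k."""
    n = len(costs)
    return n * f_all - sum(costs) + min(costs)


def poa_connected_region(f_all, costs):
    """Ratio of the best (star) welfare to the worst (line) welfare."""
    return star_welfare(f_all, costs) / line_welfare(f_all, costs)
```

For instance, with the assumed values `f_all = 5.0` and `costs = [0.1, 0.2, 0.3, 0.4]`, the star achieves 19.7 and the line 19.1, so the ratio is slightly above 1. When all costs are equal the two welfare values coincide and the ratio is 1, and lowering `f_all` with the costs fixed raises the ratio.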


Appendix I 
Proof of Theorem 5 

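In the $\mathcal{K}_c$ region, the PoA can be written as

$$\mathrm{PoA} = \frac{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}(\mathcal{X})\big) - (N - 1) \min_k c_k}{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}(\mathcal{X})\big) - \sum_{j=1}^{N} c_j + \min_k c_k}.$$

It can be shown that if the KL divergence varies from $\mathrm{KL}(\mathcal{X}) = \mathrm{KL}_1$
to $\mathrm{KL}(\mathcal{X}) = \mathrm{KL}_2$, where $\mathrm{KL}_1 < \mathrm{KL}_2$ and the values of the
individual agents' entropies are fixed, then the PoA increases,
i.e. we have

$$\frac{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}_1\big) - (N - 1) \min_k c_k}{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}_1\big) - \sum_{j=1}^{N} c_j + \min_k c_k} < \frac{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}_2\big) - (N - 1) \min_k c_k}{N f\big(\sum_{i=1}^{N} H(\mathcal{X}_i) - \mathrm{KL}_2\big) - \sum_{j=1}^{N} c_j + \min_k c_k}.$$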


Appendix J 
Proof of Theorem 6 

We start with the case of $c > kH$. Assume that there exists
a link in $g^*$ with $g_{ij}^* = 1$. In this case, agent $i$ is always
better off breaking this link and producing an amount $H$
of information. This applies to any agent $i$ in $\mathcal{N}$. Thus, we
have a unique equilibrium with $g_{ij}^* = 0$ and $H^*(\mathcal{X}_i) = H$,
$\forall i,j \in \mathcal{N}$.

Now focus on the case of $c < kH$. We show that if $s$
satisfies (i), (ii), and (iii), then $s$ is an NE. The minimality
of each network component follows from
Proposition 1. Now we show that the connected network is an
NE. In a disconnected network, an agent has to produce an
amount $H$ of information, which is not optimal since $c < kH$.
Thus, no agent in a connected network has an incentive to break
its link, and part (i) follows. Since the network is minimally
connected, each agent obtains the total amount of
information $H(\mathcal{X})$. If $H(\mathcal{X}) = H$, then no agent in the
component has an incentive to alter its information production
profile because all agents benefit only from obtaining an
amount $H$ of information. Thus, part (ii) is proved. Finally, if
$c < kH^*(\mathcal{X}_{-i}), \forall i$, then no agent in the network has an incentive
to break the link it forms and produce an amount $H^*(\mathcal{X}_{-i})$
of information on its own. Thus, $s$ is a Nash equilibrium.

We now prove the converse. Let $s$ be an equilibrium.
Assume that the network has two components $C_1$ and $C_2$.
The total amount of information in each component must
be $H$ at equilibrium, thus any agent with a positive amount
of information production in one component is better off
not producing any information and forming a link to
the other component. Thus, the network is connected in NE
and part (i) follows. Due to indirect information sharing,
part (ii) follows directly. Finally, if $s$ is an equilibrium
and $g_{ij} = 1$, then this should be optimal for agent $i$, thus
$c < kH^*(\mathcal{X}_{-i}), \forall i \in \mathcal{N}$.


Appendix K 
Proof of Theorem 7 

The case of $c > kH$ is exactly the same as in Theorem 6,
and the proof is similar to that in Appendix J. Now focus
on the case of $c < kH$. We show that if $s$ satisfies (i), (ii),
and (iii), then $s$ is an NE. Part (i) follows from Proposition 1
and the proof of Theorem 6. Now assume that only one agent
in the network produces an amount $H$ of information and all
others do not produce any information and only form links in
the network. In this case, the agent producing information is not
better off producing any amount of information other than $H$. In
addition, the agents forming links are not better off forming
new links or breaking their links and producing information
since $c < kH$. Thus, part (ii) follows. Since there are $N - 1$
agents forming links, the network is connected, and no
agent benefits from forming an extra link in the network, which
concludes part (iii).

We now prove the converse. Let $s$ be an equilibrium. Due to
indirect information sharing, part (i) follows straightforwardly.
Assume that we have two agents with $H^*(\mathcal{X}_i) > H^*(\mathcal{X}_j) >
0$; then agent $j$ is always better off setting $H^*(\mathcal{X}_j) = 0$
since the aggregate information of $i$ and $j$ is $H^*(\mathcal{X}_i)$. Therefore,
the agent with maximum information production has to
set $H^*(\mathcal{X}_i) = H$, and all others do not produce information
and form a link in the network since $c < kH$. Finally, since $s$
is an equilibrium, agents act optimally (their actions are best
responses to the actions of others), thus each of the
$N - 1$ non-producers forms exactly one link in the network.

Appendix L 
Proof of Corollary 2 

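From Theorem 6, we know that when $c > kH$, we
have a unique equilibrium $s^*$ for both aggregation families, in which
$g_{ij}^* = 0, \forall i,j \in \mathcal{N}$, and $H^*(\mathcal{X}_i) = H$. Thus, we have
$H^*(\mathcal{X}_i) > 0, \forall i \in \mathcal{N}$, so the fraction of information
producers equals 1; this continues to hold when
the number of agents in the CIN grows to infinity, hence (12)
follows. Next, we focus on the total amount of information
in the network. When information aggregates through the maximum,
we have $H(\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_N) =
\max\{H, H, \ldots, H\} = H$, and (13) follows. Finally, when
information aggregates additively,
we have $H(\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_N) = \sum_{i=1}^{N} H = NH$, and (14)
follows.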

Appendix M 
Proof of Corollary 3 

We start by deriving (15). From Theorem 7, we know that
when information aggregates through the maximum, every
equilibrium has only one information producer.
When the number of agents grows to infinity, we still
have one information producer, so the fraction of information
producers tends to 0. In order to
prove (16), one needs to find one network in equilibrium,
for the additive aggregation case, in which, for arbitrary $N$, we
have $N$ information producers. Consider such a network of $N$
agents. Assume that
$H(\mathcal{X}_i) = H/N, \forall i \in \mathcal{N}$, and that the network has a single component
which is a periphery-sponsored star network. For this network,
the number of information producers is $N$. We want to show that
this network is an
NE by showing that every agent's strategy is a best response
to all others. Since $c < kH$, each
periphery agent has no incentive to break its link with the
core since $kH(N - 1)/N > c$ when $N$ is asymptotically large.
Moreover, no agent has an incentive to alter its information
production profile since the total information in the network is
$\sum_{i=1}^{N} H/N = H$. Thus, $s$ is an NE. Since this applies to any $N$,
(16) follows. Finally, since the network is always connected
in any equilibrium, (17) directly follows.